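【Question title】: Is there any simple way to benchmark a Python script?
【Posted】: 2010-12-08 06:13:25
【Question】:

Usually I use the shell command time. My goal is to find out how much time and memory a run takes for a small, medium, large, or very large data set.

Are there any tools for Linux, or for Python alone, that can do this?

【Comments】:

Tags: python unix shell benchmarking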

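【Answer 1】:

Have a look at timeit, the Python profiler, and pycallgraph. Also make sure to read the comment below by nikicc mentioning "SnakeViz". It gives you yet another visualisation of profiling data, which can be helpful.

timeit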

def test():
    """Stupid test function"""
    lst = []
    for i in range(100):
        lst.append(i)

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("test()", setup="from __main__ import test"))

    # For Python>=3.5 one can also write:
    print(timeit.timeit("test()", globals=locals()))

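Essentially, you can pass it Python code as a string parameter, and it will run that code the specified number of times and return the total execution time. The important bits from the docs:

timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000, globals=None) Create a Timer instance with the given statement, setup code and timer function and run its timeit method with number executions. The optional globals argument specifies a namespace in which to execute the code.

... and:

Timer.timeit(number=1000000) Time number executions of the main statement. This executes the setup statement once, and then returns the time it takes to execute the main statement a number of times, measured in seconds as a float. The argument is the number of times through the loop, defaulting to one million. The main statement, the setup statement and the timer function to be used are passed to the constructor.

Note: By default, timeit temporarily turns off garbage collection during the timing. The advantage of this approach is that it makes independent timings more comparable. The disadvantage is that GC may be an important component of the performance of the function being measured. If so, GC can be re-enabled as the first statement in the setup string. For example:

timeit.Timer('for i in range(10): oct(i)', 'gc.enable()').timeit()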
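
To check whether garbage collection matters for a particular snippet, the same statement can be timed both ways. This is a minimal sketch: the statement and the number value are illustrative assumptions, not taken from the docs.

```python
import timeit

# Illustrative statement that allocates many small container objects.
stmt = "[[] for _ in range(1000)]"

# Default behaviour: GC is switched off while the statement is timed.
gc_off = timeit.timeit(stmt, number=1000)

# Re-enable GC as the first statement of the setup string.
gc_on = timeit.timeit(stmt, setup="gc.enable()", number=1000)

print(f"GC disabled: {gc_off:.4f} s, GC enabled: {gc_on:.4f} s")
```

If the two numbers are close, GC is probably not a significant factor for that code.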

Profiling

Profiling will give you a much more detailed idea of what is going on. Here is the "instant example" from the official docs:

import cProfile
import re
cProfile.run('re.compile("foo|bar")')

This will give you:

      197 function calls (192 primitive calls) in 0.002 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.001    0.001 <string>:1(<module>)
     1    0.000    0.000    0.001    0.001 re.py:212(compile)
     1    0.000    0.000    0.001    0.001 re.py:268(_compile)
     1    0.000    0.000    0.000    0.000 sre_compile.py:172(_compile_charset)
     1    0.000    0.000    0.000    0.000 sre_compile.py:201(_optimize_charset)
     4    0.000    0.000    0.000    0.000 sre_compile.py:25(_identityfunction)
   3/1    0.000    0.000    0.000    0.000 sre_compile.py:33(_compile)

Both of these modules should give you an idea of where to look for bottlenecks.
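
For larger programs the default report gets long, so it can help to sort it and keep only the top entries. The sketch below uses cProfile together with pstats from the standard library; build_squares and workload are hypothetical functions invented for the illustration.

```python
import cProfile
import io
import pstats


def build_squares(n):
    """Hypothetical workload used only for this illustration."""
    return [i * i for i in range(n)]


def workload():
    for _ in range(50):
        build_squares(10_000)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Send the report to a string buffer instead of stdout.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# Sort by cumulative time and keep at most five entries.
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by "cumulative" puts the functions whose whole call tree is expensive at the top, which is usually where to start looking.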

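Also, to get a handle on the output of profile, have a look at this post.

pycallgraph

NOTE: pycallgraph has been officially abandoned since Feb. 2018. As of Dec. 2020 it was still working on Python 3.6, though. As long as there are no core changes in how Python exposes the profiling API, it should remain a helpful tool.

This module uses graphviz to create call graphs.

You can easily see by colour which paths used up the most time. You can create the graphs either with the pycallgraph API or with the packaged script: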

pycallgraph graphviz -- ./mypythonscript.py

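The overhead is quite considerable, though. So for processes that already run for a long time, creating the graph can take a while.

【Comments】:

If you use cProfile, there is also an option to profile an entire script and save the results to a file with python -m cProfile -o results.prof myscript.py. The output file can then be viewed nicely in a browser with a program called SnakeViz, using snakeviz results.prof.
The last release of pycallgraph was in 2013, and it has been officially abandoned since 2018.
@Boris Good to know. I actually used it just yesterday and, at least for now, it still works. I will update the post. Thanks for the info.
I installed pycallgraph using pip install pycallgraph. If I run the command above on my script, I get the following error: 'pycallgraph' is not recognized as an internal or external command, operable program or batch file. Any idea why?
When you run pip install, it creates executables in a specific folder that needs to be in your PATH environment variable. Which folder that is depends on your Python installation. I recommend installing tools like this into a virtual environment or via pipx. pipx still requires you to have the appropriate folder on your PATH, but overall it makes managing executables easier.

【Answer 2】:

I use a simple decorator to time the func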

import time

def st_time(func):
    """
        st decorator to calculate the total time of a func
    """

    def st_func(*args, **keyArgs):
        t1 = time.time()
        r = func(*args, **keyArgs)
        t2 = time.time()
        print("Function=%s, Time=%s" % (func.__name__, t2 - t1))
        return r

    return st_func
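
To show how such a decorator is applied, here is a self-contained variation. It is a sketch rather than the answer's exact code: it swaps in time.perf_counter and functools.wraps, and slow_sum is a hypothetical function added for the demonstration.

```python
import functools
import time


def st_time(func):
    """Print how long each call to func takes."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def st_func(*args, **kwargs):
        t1 = time.perf_counter()  # monotonic, high-resolution clock
        result = func(*args, **kwargs)
        t2 = time.perf_counter()
        print("Function=%s, Time=%s" % (func.__name__, t2 - t1))
        return result

    return st_func


@st_time  # equivalent to: slow_sum = st_time(slow_sum)
def slow_sum(n):
    """Hypothetical function to be timed."""
    return sum(range(n))


total = slow_sum(1_000_000)  # prints the timing line, then returns the sum
```

The decorator replaces slow_sum with st_func, so every call is timed and the original return value is still passed through.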

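【Comments】:

Of course it's print "Function=%s, Time=%s" %(func.__name__, t2 - t1). Thanks, really handy
Could you explain how this approach works and how to use it?
This is more intuitive to use than timeit

【Answer 3】:

The timeit module is slow and weird, so I wrote this: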

def timereps(reps, func):
    from time import time
    start = time()
    for i in range(0, reps):
        func()
    end = time()
    return (end - start) / reps

Example:

import os
listdir_time = timereps(10000, lambda: os.listdir('/'))
print("python can do %d os.listdir('/') per second" % (1 / listdir_time))

For me, it says:

python can do 40925 os.listdir('/') per second

This is a primitive sort of benchmarking, but it's good enough.

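【Comments】:

@exhuma, I have forgotten the details; maybe I was too hasty in my assessment! I think I said "weird" because it takes two blocks of code as strings (rather than a function/lambda). But I can see its value when timing very short-running pieces of code. I guess I said "slow" because it defaults to 1,000,000 loops, and I did not look into how to adjust that! I like that my code already divides by the number of reps. But timeit is no doubt a better solution, and I apologize for dissing it.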
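
For reference, timeit also accepts a callable in place of a code string, and its number argument sets the loop count, so the same measurement can be sketched with the standard library alone. Listing the current directory ('.') is an assumption made so that the snippet runs anywhere.

```python
import os
import timeit

reps = 10_000

# stmt may be a callable; number sets how many times it is called.
total = timeit.timeit(lambda: os.listdir("."), number=reps)
per_call = total / reps  # timeit returns the total, not the average

print("python can do %d os.listdir('.') per second" % (1 / per_call))
```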

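【Answer 4】:

I usually do a quick time ./script.py to see how long it takes. That does not show you the memory, though, at least not by default. You can use /usr/bin/time -v ./script.py to get a lot of information, including memory usage.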
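
A similar peak-memory figure can be read from inside a script with the standard resource module. This is a rough sketch that works on Unix only; the list built here is a hypothetical workload, and the unit handling assumes the usual convention that ru_maxrss is reported in kibibytes on Linux and in bytes on macOS.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_maxrss is in kibibytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return usage.ru_maxrss / divisor


before = peak_rss_mib()
data = list(range(2_000_000))  # hypothetical workload
after = peak_rss_mib()
print(f"peak RSS: {before:.1f} MiB -> {after:.1f} MiB")
```

The value is a high-water mark for the whole process, so it never decreases when memory is freed.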

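【Comments】:

Keep in mind that /usr/bin/time with the -v option is not available by default in many distributions and has to be installed: sudo apt-get install time on Debian, Ubuntu, etc.; pacman -S time on Arch Linux.

【Answer 5】:

A memory profiler for all your memory needs.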

https://pypi.python.org/pypi/memory_profiler

Run a pip install:

pip install memory_profiler

Import the library:

import memory_profiler

Add a decorator to the item you wish to profile:

@profile
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    del b
    return a

if __name__ == '__main__':
    my_func()

Execute the code:

python -m memory_profiler example.py

Receive the output:

 Line #    Mem usage  Increment   Line Contents
 ==============================================
 3                           @profile
 4      5.97 MB    0.00 MB   def my_func():
 5     13.61 MB    7.64 MB       a = [1] * (10 ** 6)
 6    166.20 MB  152.59 MB       b = [2] * (2 * 10 ** 7)
 7     13.61 MB -152.59 MB       del b
 8     13.61 MB    0.00 MB       return a

The example is from the docs linked above.
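
If installing a third-party package is not an option, the standard library's tracemalloc gives current and peak figures for memory allocated by Python, though not a per-line table. The sketch below reuses my_func with the list sizes reduced tenfold, which is a change made here to keep the demonstration light.

```python
import tracemalloc


def my_func():
    a = [1] * (10 ** 5)
    b = [2] * (2 * 10 ** 6)
    del b
    return a


tracemalloc.start()
result = my_func()
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
tracemalloc.stop()

print(f"current: {current / 2**20:.1f} MiB, peak: {peak / 2**20:.1f} MiB")
```

The peak includes the temporary list b, while the current value only reflects what is still alive after the call.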

【Comments】:

【Answer 6】:

snakeviz: an interactive viewer for cProfile

https://github.com/jiffyclub/snakeviz/

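cProfile was mentioned at https://stackoverflow.com/a/1593034/895245 and snakeviz was mentioned in a comment, but I wanted to highlight it further.

It is very hard to debug program performance just by looking at cProfile / pstats output, because out of the box they can only give total times per function.

In general, however, what we really need is a nested view that contains the stack trace of each call, so that the main bottlenecks are actually easy to find.

This is exactly what snakeviz provides through its default "icicle" view.

First you have to dump the cProfile data to a binary file, and then you can run snakeviz on that file:

pip install -U snakeviz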
python -m cProfile -o results.prof myscript.py
snakeviz results.prof

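This prints a URL to stdout that you can open in your browser; the page contains the desired output.

Then you can:

Hover over each box to see the full path to the file that contains the function
Click on a box to make that box show up at the top, as a way to zoom in

More profile-oriented question: How can you profile a Python script?

【Comments】:

Great answer! Is there an option that lets me filter the results better? For example, I am only interested in profiling my own functions. I would also like to profile all functions that belong to a specific script or class. Is this possible?
@Samuel Thanks! Sorry, I don't know about filtering. If you end up finding something or asking a new question, please leave another comment.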
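
One possible way to filter, outside snakeviz itself, is the restriction argument of pstats.Stats.print_stats: a string is treated as a regular expression and matched against the "filename:lineno(function)" column. This is only a sketch; my_encode and my_workload are hypothetical names, and the my_ prefix is an assumed naming convention used to pick out "own" functions.

```python
import cProfile
import io
import json
import pstats


def my_encode(payload):
    """Hypothetical 'own' function that calls into library code."""
    return json.dumps(payload)


def my_workload():
    for i in range(200):
        my_encode({"id": i, "values": list(range(50))})


profiler = cProfile.Profile()
profiler.runcall(my_workload)

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
# Keep only rows whose function name starts with "my_".
stats.sort_stats("cumulative").print_stats(r"\(my_")
filtered_report = buffer.getvalue()
print(filtered_report)
```

To restrict the report to one script instead, the pattern can match the file name, for example r"myscript\.py".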

【Answer 7】:

Have a look at nose and its plugins, this one in particular.

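Once installed, nose is a script in your path that you can call in a directory that contains some Python scripts:

$: nosetests

This looks at all the Python files in the current directory and executes any function that it recognizes as a test: for example, it recognizes any function with the word test_ in its name as a test.

So you can just create a Python script called test_yourfunction.py and write something like this in it: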

$: cat > test_yourfunction.py

def test_smallinput():
    yourfunction(smallinput)

def test_mediuminput():
    yourfunction(mediuminput)

def test_largeinput():
    yourfunction(largeinput)

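Then you have to run

$: nosetests --with-profile --profile-stats-file yourstatsprofile.prof test_yourfunction.py

To read the profile file, use this Python line:

python -c "import hotshot.stats ; stats = hotshot.stats.load('yourstatsprofile.prof') ; stats.sort_stats('time', 'calls') ; stats.print_stats(200)"

【Comments】:

It seems to me that this does the same as the profiler in the standard Python library. Testing is not the topic of the question. Also: nose relies on hotshot, which has not been maintained since Python 2.5 and is only kept "for specialized usage".

【Answer 8】:

Be careful: timeit is very slow. It takes 12 seconds on my medium processor just to initialize (or maybe to run the function). You can test it with the code from the accepted answer: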

def test():
    lst = []
    for i in range(100):
        lst.append(i)

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("test()", setup="from __main__ import test")) # 12 second

For simple things I use time instead; on my PC it returns the result 0.0

import time

def test():
    lst = []
    for i in range(100):
        lst.append(i)

t1 = time.time()

test()

result = time.time() - t1
print(result) # 0.000000xxxx
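
A result of exactly 0.0 suggests that the clock ticks more coarsely than the call lasts. As a hedged sketch, time.get_clock_info reports the resolution each clock advertises on the current platform, and time.perf_counter can be used for a one-off measurement instead of time.time.

```python
import time


def test():
    lst = []
    for i in range(100):
        lst.append(i)


# Compare the advertised resolution of the two clocks on this platform.
for name in ("time", "perf_counter"):
    info = time.get_clock_info(name)
    print(f"{name}: resolution={info.resolution}, monotonic={info.monotonic}")

start = time.perf_counter()
test()
elapsed = time.perf_counter() - start
print(f"one call took {elapsed:.9f} s")
```

A single measurement like this is still noisy, so it is best read as an order of magnitude.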

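【Comments】:

timeit runs your function many times to average out noise. The number of repetitions is an option; see Benchmarking run times in python or the later part of the accepted answer to this question.

【Answer 9】:

An easy way to quickly test any function is the %timeit magic in IPython or Jupyter: %timeit my_code

For instance: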

%timeit a = 1

13.4 ns ± 0.781 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)
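
Outside IPython, a roughly comparable measurement can be sketched with timeit.Timer.autorange (Python 3.6+), which picks the loop count automatically. It does not reproduce the mean and standard deviation that %timeit reports.

```python
import timeit

timer = timeit.Timer("a = 1")

# autorange() raises the loop count until one run takes at least 0.2 s
# and returns (number_of_loops, total_seconds).
loops, total = timer.autorange()
print(f"{total / loops * 1e9:.1f} ns per loop ({loops} loops)")
```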

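【Comments】:

【Answer 10】:

If you don't want to write boilerplate code for timeit and want results that are easy to analyze, take a look at benchmarkit. It also saves a history of previous runs, so it is easy to compare the same function over the course of development.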

# pip install benchmarkit

from benchmarkit import benchmark, benchmark_run

N = 10000
seq_list = list(range(N))
seq_set = set(range(N))

SAVE_PATH = '/tmp/benchmark_time.jsonl'

@benchmark(num_iters=100, save_params=True)
def search_in_list(num_items=N):
    return num_items - 1 in seq_list

@benchmark(num_iters=100, save_params=True)
def search_in_set(num_items=N):
    return num_items - 1 in seq_set

benchmark_results = benchmark_run(
   [search_in_list, search_in_set],
   SAVE_PATH,
   comment='initial benchmark search',
)

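Prints to the terminal and returns a list of dictionaries with the data for the last run. Command-line entry points are also available.

If you change N=1000000 and rerun

【Comments】:

【Answer 11】:

Based on Danyun Liu's answer, with some convenience features; perhaps it is useful to someone.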

def stopwatch(repeat=1, autorun=True):
    """
    stopwatch decorator to calculate the total time of a function
    """
    import timeit
    import functools
    
    def outer_func(func):
        @functools.wraps(func)
        def time_func(*args, **kwargs):
            t1 = timeit.default_timer()
            for _ in range(repeat):
                r = func(*args, **kwargs)
            t2 = timeit.default_timer()
            print(f"Function={func.__name__}, Time={t2 - t1}")
            return r
        
        if autorun:
            try:
                time_func()
            except TypeError:
                raise Exception(f"{time_func.__name__}: autorun only works with no parameters, you may want to use @stopwatch(autorun=False)") from None
        
        return time_func
    
    if callable(repeat):
        func = repeat
        repeat = 1
        return outer_func(func)
    
    return outer_func

Some tests:

def is_in_set(x):
    return x in {"linux", "darwin"}

def is_in_list(x):
    return x in ["linux", "darwin"]

@stopwatch
def run_once():
    import time
    time.sleep(0.5)

@stopwatch(autorun=False)
def run_manually():
    import time
    time.sleep(0.5)

run_manually()

@stopwatch(repeat=10000000)
def repeat_set():
    is_in_set("windows")
    is_in_set("darwin")

@stopwatch(repeat=10000000)
def repeat_list():
    is_in_list("windows")
    is_in_list("darwin")

@stopwatch
def should_fail(x):
    pass

Results:

Function=run_once, Time=0.5005391679987952
Function=run_manually, Time=0.500624185999186
Function=repeat_set, Time=1.7064883739985817
Function=repeat_list, Time=1.8905151920007484
Traceback (most recent call last):
  (some more traceback here...)
Exception: should_fail: autorun only works with no parameters, you may want to use @stopwatch(autorun=False)

【Comments】:

【Answer 12】:

line_profiler (execution time line by line)

Installation

pip install line_profiler

Usage

Add the @profile decorator before the function. For example:

@profile
def function(base, index, shift):
    addend = index << shift
    result = base + addend
    return result

Use the command kernprof -l <file_name> to create an instance of line_profiler. For example:

kernprof -l test.py

On success, kernprof prints Wrote profile results to <file_name>.lprof. For example:

Wrote profile results to test.py.lprof

Use the command python -m line_profiler <file_name>.lprof to print the benchmark results. For example:

python -m line_profiler test.py.lprof

You will see details for each line of code:

Timer unit: 1e-06 s

Total time: 0.0021632 s
File: test.py
Function: function at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           @profile
     2                                           def function(base, index, shift):
     3      1000        796.4      0.8     36.8      addend = index << shift
     4      1000        745.9      0.7     34.5      result = base + addend
     5      1000        620.9      0.6     28.7      return result

memory_profiler (memory usage line by line)

Installation

pip install memory_profiler

Usage

Add the @profile decorator before the function. For example:

@profile
def function():
    result = []
    for i in range(10000):
        result.append(i)
    return result

Use the command python -m memory_profiler <file_name> to print the benchmark results. For example:

python -m memory_profiler test.py

You will see details for each line of code:

Filename: test.py

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
     1   40.246 MiB   40.246 MiB           1   @profile
     2                                         def function():
     3   40.246 MiB    0.000 MiB           1       result = []
     4   40.758 MiB    0.008 MiB       10001       for i in range(10000):
     5   40.758 MiB    0.504 MiB       10000           result.append(i)
     6   40.758 MiB    0.000 MiB           1       return result

Good practice

Call the function many times to minimize the impact of the environment.
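
One way to do that with the standard library is timeit.repeat, which returns one total per run so the runs can be compared. A minimal sketch, assuming five runs of 100 calls each is enough for this function; both numbers are arbitrary choices.

```python
import statistics
import timeit


def function():
    result = []
    for i in range(10000):
        result.append(i)
    return result


# Five independent runs of 100 calls each; one total per run.
runs = timeit.repeat(function, repeat=5, number=100)
per_call = [t / 100 for t in runs]

print(f"min:    {min(per_call) * 1e6:.1f} us per call")
print(f"median: {statistics.median(per_call) * 1e6:.1f} us per call")
```

The timeit documentation suggests looking mainly at the minimum, since higher values are typically caused by other processes interfering rather than by the code itself.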

【Comments】: