Why does Hypothesis consider this code slow? Answers

【Question title】: Why does hypothesis consider this code slow?
【Posted】: 2021-10-14 11:20:08
【Question description】:

Hypothesis complains vehemently that this is slow:

@composite
def f_and_g_and_padding(draw, in_channels = channel_ints, out_channels = channel_ints, fs = shapes_2d, fill=None, elements=well_behaved_floats):
    shape_f = draw(basic_shape)
    padding = draw(shapes_2d)
    fs = draw(fs)
    in_channels = draw(in_channels)
    out_channels = draw(out_channels)
    batch_size = draw(shape_ints)
    shape_f = (batch_size, in_channels, fs[0], fs[1])
    f = draw(stnp.arrays(dt_numpy, shape_f, elements=elements, fill=fill))
    h_in = f.shape[2] + padding[0] * 2
    w_in = f.shape[3] + padding[1] * 2
    shape_g = (out_channels, in_channels, h_in, w_in)
    g = draw(stnp.arrays(dt_numpy, shape_g, elements=elements, fill=fill))
    
    return (f, g, padding)

I tried to find out why, but failed. See: How to use pytest, hypothesis and line_profiler / kernprof together?

So my question remains: why?

Here are the other strategies used:

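# Imports inferred from the aliases used below; dt_numpy is defined elsewhere in the asker's code and is not shown.
import numpy
from hypothesis import strategies as st
import hypothesis.extra.numpy as stnp

well_behaved_floats = stnp.from_dtype(dtype=dt_numpy, allow_infinity=False, allow_nan=False)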
small_floats = stnp.from_dtype(dtype=dt_numpy, min_value=-10000, max_value=10000, allow_infinity=False, allow_nan=False)
floats_0_1 = stnp.from_dtype(dtype=dt_numpy, min_value=-1, max_value=1, allow_infinity=False, allow_nan=False)
small_ints = stnp.from_dtype(dtype=numpy.dtype("i4"), allow_infinity=False, allow_nan=False, min_value=-10, max_value=10)
small_positive_ints = stnp.from_dtype(dtype=numpy.dtype("i4"), allow_infinity=False, allow_nan=False, min_value=0, max_value=10)
one_or_greater = st.integers(min_value=1)
shape_ints = st.integers(min_value=1, max_value=4)
channel_ints = st.integers(min_value=1, max_value=10)
basic_shape = stnp.array_shapes(min_dims=4, max_dims=4, min_side=1, max_side=10)
ones = st.integers(min_value=1, max_value=1)

shapes_2d = stnp.array_shapes(min_dims=2, max_dims=2, min_side=1, max_side=4)

Used like this:

@given(f_and_g_and_padding(elements=ones))
def test_padding(f_g_padding: Tuple[numpy.ndarray, numpy.ndarray, Tuple[int, int]]):
    f, g, padding = f_g_padding
    run_test(Tensor(f), Tensor(g), padding=padding)

No filtering, just simple draws and numpy arrays.

This is the Hypothesis configuration:

hypothesis.settings.register_profile("default",
                                     derandomize=True,
                                     deadline=None,
                                     print_blob=True,
                                     report_multiple_bugs=False,
                                     suppress_health_check=[HealthCheck.too_slow])

【Question comments】:

Tags: python python-hypothesis

【Solution 1】:

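I expect that your basic_shape strategy is the culprit: with four dimensions, the number of elements already grows as n^4 in the average side length n, and that is just going to be slow. Consider reducing max_side for this strategy; if that is not acceptable, you may need to generate shapes with Hypothesis but elements with numpy.random.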
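A minimal sketch of the "shapes from Hypothesis, elements from numpy.random" idea follows. It is an illustration under assumptions, not the asker's code: the names shapes_4d, seeds, random_filled_array and check_shape are hypothetical, float32 stands in for the undisclosed dt_numpy, and max_side=5 is an arbitrary choice. The seed is drawn from Hypothesis so that a failing example can still be replayed.

```python
import numpy
from hypothesis import given, settings, strategies as st
import hypothesis.extra.numpy as stnp

# Hypothetical names; float32 is an assumed stand-in for dt_numpy.
shapes_4d = stnp.array_shapes(min_dims=4, max_dims=4, min_side=1, max_side=5)
seeds = st.integers(min_value=0, max_value=2**32 - 1)


@st.composite
def random_filled_array(draw, shape=shapes_4d, low=-1.0, high=1.0):
    # Hypothesis chooses only the shape and a seed (two cheap draws).
    shp = draw(shape)
    seed = draw(seeds)
    # numpy fills the whole array in one vectorised call.
    rng = numpy.random.default_rng(seed)
    return rng.uniform(low, high, size=shp).astype(numpy.float32)


@settings(max_examples=20, deadline=None)
@given(random_filled_array())
def check_shape(arr):
    assert arr.ndim == 4
    assert all(1 <= side <= 5 for side in arr.shape)
    assert numpy.all(numpy.isfinite(arr))


check_shape()  # runs the property directly, outside pytest
```

The trade-off is that Hypothesis can shrink the shape and the seed but not the individual elements, so minimal failing examples will be less tidy than with stnp.arrays.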

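I'd also recommend against passing allow_infinity=False, allow_nan=False to integer or bounded-float strategies. In either case non-finite values are already ruled out, so the arguments do nothing and only hurt readability.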
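As a sketch of that cleanup, the two strategies below drop the redundant flags and a quick property checks that only finite, in-range values appear. The dtype "f4" is an assumption standing in for dt_numpy, which the question does not show, and check_finite is a hypothetical helper.

```python
import numpy
from hypothesis import given, settings
import hypothesis.extra.numpy as stnp

# Integer dtype: NaN and infinity cannot occur, so no flags are needed.
small_ints = stnp.from_dtype(numpy.dtype("i4"), min_value=-10, max_value=10)

# Bounded floats: both bounds are given, so non-finite values are excluded.
# "f4" is an assumed stand-in for the asker's dt_numpy.
floats_0_1 = stnp.from_dtype(numpy.dtype("f4"), min_value=-1, max_value=1)


@settings(max_examples=50, deadline=None)
@given(small_ints, floats_0_1)
def check_finite(i, x):
    assert -10 <= i <= 10
    assert numpy.isfinite(x)
    assert -1 <= x <= 1


check_finite()  # runs the property directly, outside pytest
```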

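【Comments】:

Interesting ideas! The shape needs to be 4-dimensional, so reducing max_side is not an option. I hadn't considered that numpy.random could be faster than draw(stnp.arrays(...)); I'll try that out. I also removed the nan and infinity checks. Those were copy-paste leftovers, good catch. After I posted the question I also noticed that the machine I was running pytest on had 100% CPU usage due to other users, so if pytest only checks elapsed time (as opposed to, e.g., elapsed process time or executed CPU instructions), that would certainly do it.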
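The distinction the comment draws between elapsed time and process time can be seen with the standard library alone. In this sketch, time.sleep stands in for time spent waiting while other users hold the CPU; measure is a hypothetical helper, and nothing here reflects how the health check itself is implemented.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent running fn()."""
    wall_start = time.perf_counter()   # wall-clock time
    cpu_start = time.process_time()    # CPU time of this process only
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


# Sleeping uses almost no CPU, but the wall clock keeps running.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

A check based on wall time would count the whole wait, while one based on process time would barely register it.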
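IIRC pytest does consider wall time, so that's probably a significant part of it! But also note that you can leave min_dims=4 and reduce to e.g. max_side=5; same dimensionality, but far fewer elements.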
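The arithmetic behind "far fewer elements" can be checked in a few lines. The first part compares 4-D shapes at max_side=10 and max_side=5; the second part works out the largest f and g arrays from the bounds given in the question (shape_ints up to 4, channel_ints up to 10, shapes_2d sides up to 4). The helper name max_elements is hypothetical.

```python
from math import prod


def max_elements(max_side, dims=4):
    """Largest element count of a dims-dimensional array with sides <= max_side."""
    return max_side ** dims


print(max_elements(10))  # 10000
print(max_elements(5))   # 625

# In the question's composite, the shape_f drawn from basic_shape is
# overwritten, so the bounds below come from the reassigned tuples.
max_f = prod((4, 10, 4, 4))                   # batch, in_channels, fs[0], fs[1]
max_g = prod((10, 10, 4 + 2 * 4, 4 + 2 * 4))  # out, in, h_in, w_in with padding
print(max_f)  # 640
print(max_g)  # 14400
```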