Sliding window using as_strided function in numpy? Answers

【Question title】: Sliding window using as_strided function in numpy?
【Posted】: 2011-11-24 10:08:16
【Question description】:

While implementing a sliding window in Python to detect objects in still images, I came across this nice function:

numpy.lib.stride_tricks.as_strided

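So I tried to work out a general rule, to avoid bugs that might appear whenever I change the size of the sliding windows I need. In the end I arrived at this representation: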

all_windows = as_strided(x, ((x.shape[0] - xsize)/xstep, (x.shape[1] - ysize)/ystep, xsize, ysize), (x.strides[0]*xstep, x.strides[1]*ystep, x.strides[0], x.strides[1]))

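This produces a 4-dimensional array. The first two dimensions are the number of windows along the x and y axes of the image. The other two are the size of the window (xsize, ysize).

The step is the displacement between two consecutive windows.

This representation works fine if I choose a square sliding window. But I still have a problem getting it to work for windows of e.g. (128, 64), where I usually get data that is unrelated to the image.

What is wrong with my code? Any ideas? And is there a better way to get a nice, neat sliding window in Python for image processing?

Thanks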

【Question comments】:

Since you are looking for a template-matching algorithm, this post using strides may be worth a look.

Tags: image-processing numpy computer-vision scipy

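【Solution 1】:

Check out the answers to this question: Using strides for an efficient moving average filter. Basically, strides are not a great option, although they do work.

【Discussion】:

So, for detection and template-matching algorithms, is there a better alternative for extracting sliding windows at different scales quickly and neatly?
I need to extract such sub-windows in order to compute some HOG features inside the window, and then classify that instance with an already trained classifier to check whether it is a sub-window I care about.
For the sake of argument, yes, something like that. As for HOG, I previously implemented it with nested loops to extract its cell histograms, and I thought the strides trick could save some performance and eliminate the two loops, which are very expensive in languages like Python or MATLAB.
@JustInTime Hi, were you able to solve your problem? I also have some questions about the sliding window approach; may I contact you?
Take a look at view_as_windows; I hope it helps.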

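【Solution 2】:

There is a problem in your code. Actually, this code works for 2D, and there is no reason to use the multi-dimensional version (Using strides for an efficient moving average filter). Here is the fixed version: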

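import numpy as np
from numpy.lib.stride_tricks import as_strided
xsize, ysize, xstep, ystep = 3, 3, 1, 1  # example values; not given in the original answer
A = np.arange(100).reshape((10, 10))
print(A)
# Integer division (//) keeps the shape entries integers on Python 3
all_windows = as_strided(A, ((A.shape[0] - xsize + 1) // xstep, (A.shape[1] - ysize + 1) // ystep, xsize, ysize),
      (A.strides[0] * xstep, A.strides[1] * ystep, A.strides[0], A.strides[1]))
print(all_windows)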
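
As a rough sketch of the same idea wrapped in a helper: the function name sliding_windows_2d and its parameters are hypothetical, and the window count (n - size) // step + 1 per axis is my assumption (the usual way to count window positions), not something taken from the answer above. The example uses a non-square window and compares one window against plain slicing.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_windows_2d(a, win_rows, win_cols, step_rows=1, step_cols=1):
    """Return a read-only 4D view of shape (n_rows, n_cols, win_rows, win_cols)."""
    if a.ndim != 2:
        raise ValueError("expected a 2D array")
    if win_rows > a.shape[0] or win_cols > a.shape[1]:
        raise ValueError("window is larger than the array")
    # Number of window positions along each axis (assumed formula).
    n_rows = (a.shape[0] - win_rows) // step_rows + 1
    n_cols = (a.shape[1] - win_cols) // step_cols + 1
    s0, s1 = a.strides
    return as_strided(
        a,
        shape=(n_rows, n_cols, win_rows, win_cols),
        strides=(s0 * step_rows, s1 * step_cols, s0, s1),
        writeable=False,  # windows overlap, so writing through the view is unsafe
    )


A = np.arange(100).reshape(10, 10)
windows = sliding_windows_2d(A, win_rows=4, win_cols=2, step_rows=2, step_cols=3)

# Window (i, j) should equal the slice starting at row i*2, column j*3.
i, j = 1, 2
expected = A[i * 2:i * 2 + 4, j * 3:j * 3 + 2]
assert np.array_equal(windows[i, j], expected)
print(windows.shape)
```

Taking the strides from a.strides rather than hard-coding them keeps the helper usable on non-contiguous inputs such as slices or transposes.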

【Discussion】:

【Solution 3】:

For posterity:

This is implemented in scikit-learn in the function sklearn.feature_extraction.image.extract_patches.

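【Discussion】:

I believe this approach copies the patches from the original image, and it seems the strides approach exists specifically to avoid doing that.
As an alternative, see skimage.util.view_as_windows (scikit-image.org/docs/dev/api/…)
@AlexKlibisz extract_patches uses strides in their most general form (it works for ndarrays of any dimension, and you can extract patches of arbitrary shape at arbitrary step sizes). extract_patches_2d uses this function but then calls reshape and hence produces a copy (which is needed in the 2D case). [Full disclosure: I wrote extract_patches]
It looks like the skimage function implements exactly the same functionality. For completeness, here is the source of extract_patches
You are right, my mistake. I used extract_patches_2d by mistake instead of extract_patches.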
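
The view-versus-copy point in this discussion can be checked directly in NumPy. The sketch below uses neither scikit-learn nor scikit-image. As a stand-in it uses numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later) and np.shares_memory to test whether a result still aliases the original buffer. Reshaping the windows into a flat list of patches is the step where a copy appears.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.arange(36, dtype=np.float64).reshape(6, 6)

# Strided windows: a 4D view onto the same buffer, no data copied.
windows = sliding_window_view(image, (3, 3))
print(windows.shape, np.shares_memory(windows, image))

# Flattening the two window-index axes into one cannot be expressed
# with strides alone here, so NumPy has to copy the data.
patches = windows.reshape(-1, 3, 3)
print(patches.shape, np.shares_memory(patches, image))
```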

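【Solution 4】:

I had a similar use case where I needed to create sliding windows over a batch of multi-channel images, and ended up with the function below. I've written a more in-depth blog post covering this in regards to manually creating a Convolution layer. The function implements the sliding windows and also supports dilating or padding the input array.

The function takes as input: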

input - Size of (Batch, Channel, Height, Width)
output_size - Depends on usage, comments below.
kernel_size - size of the sliding window you wish to create (square)
padding - amount of 0-padding added to the outside of the (H,W) dimensions
stride - stride the sliding window should take over the inputs
dilate - amount to spread the cells of the input. This adds 0-filled rows/cols between elements

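Typically, when performing a forward convolution you do not need dilation, so the output size can be found with the following formula (replace x with the input dimension):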

(x - kernel_size + 2 * padding) // stride + 1
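
As a quick worked example of this formula, here is a small sketch; the helper name conv_output_size is hypothetical and only wraps the expression above for one spatial dimension.

```python
def conv_output_size(x, kernel_size, padding=0, stride=1):
    """Output length along one dimension for a forward convolution."""
    return (x - kernel_size + 2 * padding) // stride + 1


print(conv_output_size(28, kernel_size=3, padding=1, stride=1))  # 28
print(conv_output_size(7, kernel_size=3, padding=0, stride=2))   # 3
print(conv_output_size(32, kernel_size=5, padding=2, stride=2))  # 16
```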

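When using this function for the backward pass of a convolution, use a stride of 1 and set output_size to the size of the forward pass's x-input.

Example code, along with an example of using this function, can be found at this link.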

import numpy as np

def getWindows(input, output_size, kernel_size, padding=0, stride=1, dilate=0):
    working_input = input
    working_pad = padding
    # dilate the input if necessary: whenever dilate is non-zero, a zero
    # row/column is inserted between neighbouring elements of axes 2 and 3
    if dilate != 0:
        working_input = np.insert(working_input, range(1, input.shape[2]), 0, axis=2)
        working_input = np.insert(working_input, range(1, input.shape[3]), 0, axis=3)

    # zero-pad the height and width dimensions if necessary
    if working_pad != 0:
        working_input = np.pad(working_input, pad_width=((0,), (0,), (working_pad,), (working_pad,)), mode='constant', constant_values=(0.,))

    # only the spatial size is taken from output_size;
    # batch and channel counts come from the input itself
    _, _, out_h, out_w = output_size
    out_b, out_c, _, _ = input.shape
    batch_str, channel_str, kern_h_str, kern_w_str = working_input.strides

    # view of shape (B, C, out_h, out_w, k, k): the outer spatial strides
    # advance by `stride` elements, the inner two walk inside each window
    return np.lib.stride_tricks.as_strided(
        working_input,
        (out_b, out_c, out_h, out_w, kernel_size, kernel_size),
        (batch_str, channel_str, stride * kern_h_str, stride * kern_w_str, kern_h_str, kern_w_str)
    )
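
To illustrate how windows of this shape might feed a hand-written convolution, here is a minimal sketch. It does not call getWindows. It uses a pared-down stand-in, windows_nchw (a hypothetical name), that skips dilation and computes the output size itself. The einsum contraction is my own illustration of the idea, not the answer author's code.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def windows_nchw(x, kernel_size, padding=0, stride=1):
    """Sliding windows over a (B, C, H, W) array: (B, C, out_h, out_w, k, k)."""
    if padding:
        pad = ((0, 0), (0, 0), (padding, padding), (padding, padding))
        x = np.pad(x, pad, mode="constant")
    b, c, h, w = x.shape
    out_h = (h - kernel_size) // stride + 1
    out_w = (w - kernel_size) // stride + 1
    sb, sc, sh, sw = x.strides
    return as_strided(
        x,
        shape=(b, c, out_h, out_w, kernel_size, kernel_size),
        strides=(sb, sc, sh * stride, sw * stride, sh, sw),
        writeable=False,  # overlapping windows: keep the view read-only
    )


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 8, 8))       # batch of 2 three-channel images
weight = rng.standard_normal((4, 3, 3, 3))  # 4 filters, 3 channels, 3x3 kernel

win = windows_nchw(x, kernel_size=3, padding=1, stride=2)
# Sum over channels (c) and kernel positions (k, l) for every filter (f).
out = np.einsum("bchwkl,fckl->bfhw", win, weight)

# Cross-check one output element against an explicit slice of the padded input.
xp = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)), mode="constant")
ref = np.sum(xp[1, :, 2:5, 4:7] * weight[2])
assert np.isclose(out[1, 2, 1, 2], ref)
print(out.shape)
```

The windows stay a view onto the padded input, so the only large allocation besides the padding is the output of einsum.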

【Discussion】: