【Question title】: Segmenting numpy arrays with as_strided
【Posted】: 2018-07-20 00:00:07
【Question description】:

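I am looking for an efficient way to segment a numpy array into overlapping chunks. I know that numpy.lib.stride_tricks.as_strided is probably the way to go, but I can't seem to wrap my head around its usage in a generalized function that works on arrays of arbitrary shape. Here are some examples for specific applications of as_strided.

This is what I would like to have: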

import numpy as np
from numpy.lib.stride_tricks import as_strided

def segment(arr, axis, new_len, step=1, new_axis=None):
    """ Segment an array along some axis.

    Parameters
    ----------
    arr : array-like
        The input array.

    axis : int
        The axis along which to segment.

    new_len : int
        The length of each segment.

    step : int, default 1
        The offset between the start of each segment.

    new_axis : int, optional
        The position where the newly created axis is to be inserted. By
        default, the axis will be added at the end of the array.

    Returns
    -------
    arr_seg : array-like
        The segmented array.
    """

    # calculate shape after segmenting
    new_shape = list(arr.shape)
    new_shape[axis] = (new_shape[axis] - new_len + step) // step
    if new_axis is None:
        new_shape.append(new_len)
    else:
        new_shape.insert(new_axis, new_len)

    # TODO: calculate new strides
    strides = magic_command_returning_strides(...)

    # get view with new strides
    arr_seg = as_strided(arr, new_shape, strides)

    return arr_seg.copy()

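So I want to specify the axis to be cut into segments, the length of the segments, and the step between them. Additionally, I'd like to pass the position where the new axis is inserted as a parameter. The only thing missing is the calculation of the strides.

I am aware that this might not work directly with as_strided, i.e. I might need to implement a subroutine that returns a strided view with step=1 and new_axis in a fixed position, then slice it with the desired step and transpose it afterwards.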
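For the one-dimensional segments described above, the stride calculation can be sketched roughly as follows. This is only an illustration under stated assumptions, not the accepted solution: the helper name `segment_strides` is hypothetical, `axis` is assumed to be non-negative, and the new axis is always placed last.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def segment_strides(arr, axis, new_len, step=1):
    """Hypothetical helper: (shape, strides) of a segmented view, new axis last."""
    n_seg = (arr.shape[axis] - new_len) // step + 1
    shape = list(arr.shape)
    shape[axis] = n_seg
    shape.append(new_len)
    strides = list(arr.strides)
    # moving to the next segment skips `step` elements along `axis`
    strides[axis] = arr.strides[axis] * step
    # moving inside a segment advances one element along `axis`
    strides.append(arr.strides[axis])
    return tuple(shape), tuple(strides)


arr = np.arange(12).reshape(3, 4)
shape, strides = segment_strides(arr, axis=1, new_len=3, step=1)
# writeable=False guards against accidental writes through overlapping memory
view = as_strided(arr, shape=shape, strides=strides, writeable=False)
print(view.shape)  # (3, 2, 3)
```

The new axis could then be moved to another position with np.moveaxis, which does not copy the data.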

Here is a piece of code that works, but is obviously slow:

def segment_slow(arr, axis, new_len, step=1, new_axis=None):
    """ Segment an array along some axis. """

    # calculate shape after segmenting
    new_shape = list(arr.shape)
    new_shape[axis] = (new_shape[axis] - new_len + step) // step
    if new_axis is None:
        new_shape.append(new_len)
    else:
        new_shape.insert(new_axis, new_len)

    # check if the new axis is inserted before the axis to be segmented
    if new_axis is not None and new_axis <= axis:
        axis_in_arr_seg = axis + 1
    else:
        axis_in_arr_seg = axis

    # pre-allocate array
    arr_seg = np.zeros(new_shape, dtype=arr.dtype)

    # set up indices
    idx_old = [slice(None)] * arr.ndim
    idx_new = [slice(None)] * len(new_shape)

    # get order of transposition for assigning slices to the new array
    order = list(range(arr.ndim))
    if new_axis is None:
        order[-1], order[axis] = order[axis], order[-1]
    elif new_axis > axis:
        order[new_axis-1], order[axis] = order[axis], order[new_axis-1]

    # loop over the axis to be segmented
    for n in range(new_shape[axis_in_arr_seg]):
        idx_old[axis] = n * step + np.arange(new_len)
        idx_new[axis_in_arr_seg] = n
        # index with a tuple: indexing with a list of slices is an error in recent NumPy
        arr_seg[tuple(idx_new)] = np.transpose(arr[tuple(idx_old)], order)

    return arr_seg

Here is a test of the basic functionality:

import numpy.testing as npt    

arr = np.array([[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11]])

arr_seg_1 = segment_slow(arr, axis=1, new_len=3, step=1)
arr_target_1 = np.array([[[0, 1, 2], [1, 2, 3]],
                         [[4, 5, 6], [5, 6, 7]],
                         [[8, 9, 10], [9, 10, 11]]])

npt.assert_allclose(arr_target_1, arr_seg_1)

arr_seg_2 = segment_slow(arr, axis=1, new_len=3, step=1, new_axis=1)
arr_target_2 = np.transpose(arr_target_1, (0, 2, 1))

npt.assert_allclose(arr_target_2, arr_seg_2)

arr_seg_3 = segment_slow(arr, axis=0, new_len=2, step=1)
arr_target_3 = np.array([[[0, 4], [1, 5], [2, 6], [3, 7]],
                         [[4, 8], [5, 9], [6, 10], [7, 11]]])

npt.assert_allclose(arr_target_3, arr_seg_3)

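Any help would be greatly appreciated!

【Question comments】:

  • Congratulations on a very well-posed question! :)
  • My attempt at a canonical answer does all of this, except for rolling the new axis to the specified position. It should be easy to add np.rollaxis at the end.
  • Thanks @DanielF, I have managed to implement the functionality I need with a wrapper around your function. I will post an answer to my question based on your approach.

Tags: python arrays performance numpy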


【Solution 1】:

Based on DanielF's comment and his answer here, I implemented the function like this:

def segment(arr, axis, new_len, step=1, new_axis=None, return_view=False):
    """ Segment an array along some axis.

    Parameters
    ----------
    arr : array-like
        The input array.

    axis : int
        The axis along which to segment.

    new_len : int
        The length of each segment.

    step : int, default 1
        The offset between the start of each segment.

    new_axis : int, optional
        The position where the newly created axis is to be inserted. By
        default, the axis will be added at the end of the array.

    return_view : bool, default False
        If True, return a view of the segmented array instead of a copy.

    Returns
    -------
    arr_seg : array-like
        The segmented array.
    """

    old_shape = np.array(arr.shape)

    assert new_len <= old_shape[axis],  \
        "new_len is bigger than input array in axis"
    seg_shape = old_shape.copy()
    seg_shape[axis] = new_len

    steps = np.ones_like(old_shape)
    if step:
        step = np.array(step, ndmin = 1)
        assert step > 0, "Only positive steps allowed"
        steps[axis] = step

    arr_strides = np.array(arr.strides)

    shape = tuple((old_shape - seg_shape) // steps + 1) + tuple(seg_shape)
    strides = tuple(arr_strides * steps) + tuple(arr_strides)

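    arr_seg = as_strided(arr, shape = shape, strides = strides)

    # drop the length-1 window-count axes of the non-segmented dimensions;
    # plain indexing (unlike np.squeeze) leaves genuine length-1 axes intact
    idx = [0] * arr.ndim
    idx[axis] = slice(None)
    arr_seg = arr_seg[tuple(idx)]

    # the window-count axis is now in first position; move it back to `axis`
    arr_seg = np.moveaxis(arr_seg, 0, axis)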

    # the new axis comes right after
    if new_axis is not None:
        arr_seg = np.moveaxis(arr_seg, axis+1, new_axis)
    else:
        arr_seg = np.moveaxis(arr_seg, axis+1, -1)

    if return_view:
        return arr_seg
    else:
        return arr_seg.copy()

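This works for my case of one-dimensional segments; however, I'd recommend that anyone looking for an approach that works for segments of arbitrary dimensionality check out the code in the linked answer.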
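As a possible alternative on newer NumPy versions (1.20 and later), numpy.lib.stride_tricks.sliding_window_view covers the step=1 case directly. The sketch below is not part of the original answer: the wrapper name `segment_swv` is hypothetical, `axis` is assumed to be non-negative, and the result is a read-only view.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def segment_swv(arr, axis, new_len, step=1, new_axis=-1):
    """Hypothetical wrapper: segment `arr` along `axis` via sliding_window_view."""
    # all windows with step 1; the window axis is appended at the end
    windows = sliding_window_view(arr, new_len, axis=axis)
    # keep every `step`-th window along the segmented axis
    idx = [slice(None)] * windows.ndim
    idx[axis] = slice(None, None, step)
    windows = windows[tuple(idx)]
    # move the window axis to the requested position (returns a view)
    return np.moveaxis(windows, -1, new_axis)


arr = np.array([[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11]])
seg = segment_swv(arr, axis=1, new_len=3)
print(seg.shape)  # (3, 2, 3)
```

Calling .copy() on the result gives an independent, writable array, at the cost of allocating the full segmented size.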

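【Comments】:

  • Feel free to upvote the question so that others can find it too! Sorry I didn't have time to write up a full answer. Also note that np.moveaxis will create a copy of the as_strided view, so this method may cause memory errors if the view you create is much larger than the original array.
  • The documentation says that moveaxis returns a view...
  • Whoops! I don't know which command I was thinking of. Should be fine then!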
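The point settled in the comments above can be checked with a small experiment. This is only an illustrative sketch with a made-up array: np.moveaxis on a strided view still shares memory with the original, while .copy() allocates the full overlapping result.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

arr = np.arange(6)
itemsize = arr.strides[0]
# four overlapping windows of length three over a 6-element array
view = as_strided(arr, shape=(4, 3), strides=(itemsize, itemsize),
                  writeable=False)

moved = np.moveaxis(view, 0, -1)
print(np.shares_memory(arr, moved))   # True: still a view

copied = moved.copy()
print(np.shares_memory(arr, copied))  # False: independent memory
# the copy holds 12 elements, the original only 6
print(copied.nbytes // arr.nbytes)    # 2
```

So the memory cost comes from the final .copy() (the return_view=False path), not from moveaxis.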