Numpy index of the maximum with reduction - numpy.argmax.reduceat

【Question title】: Numpy index of the maximum with reduction - numpy.argmax.reduceat
【Posted】: 2017-06-09 13:54:42
【Question】:

I have a flat array a:

a = numpy.array([0, 1, 1, 2, 3, 1, 2])

And an index array b marking the start of each "chunk":

b = numpy.array([0, 4])

I know I can find the maximum in each "chunk" with a reduction:

m = numpy.maximum.reduceat(a,b)
>>> array([2, 3], dtype=int32)

But... is there a way to find the index of the maximum within each chunk (like numpy.argmax), using vectorized operations (no lists, no loops)?
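For reference, the loop-based version that the question wants to avoid can be sketched as below. The function name argmax_reduceat_loop is hypothetical, and the sketch assumes b is strictly increasing so that every chunk is non-empty. It returns indices relative to each chunk start; adding b gives indices into a.

```python
import numpy as np

def argmax_reduceat_loop(a, b):
    """Loop-based reference: argmax within each chunk, relative to the chunk start."""
    ends = np.append(b[1:], a.size)  # exclusive end of each chunk
    return np.array([np.argmax(a[s:e]) for s, e in zip(b, ends)])

a = np.array([0, 1, 1, 2, 3, 1, 2])
b = np.array([0, 4])

in_chunk = argmax_reduceat_loop(a, b)
print(in_chunk)      # [3 0]  (position inside each chunk)
print(in_chunk + b)  # [3 4]  (position inside a)
```

Such a loop is handy as a correctness check for any vectorized replacement.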

【Comments】:

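I temporarily deleted my question because I thought I had the answer: numpy.argmax(numpy.equal.outer(m,a), axis=1), but this does not work for examples where the same maximum value occurs in more than one place...
For example on this array: a = numpy.array([0, 1, 1, 3, 3, 1, 2]), where the same maximum 3 occurs in both chunks.
The problem is that np.maximum is a ufunc, and reduceat effectively iterates through the array, comparing 2 values at a time. But np.max and np.argmax are functions that operate on the whole array at once. They are not ufuncs.
@hpaulj, yes, I know that. I am asking whether anyone can think of a workaround with the same behavior.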
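The two points made in these comments can be checked with a short sketch. The helper name outer_workaround is hypothetical; it simply wraps the expression quoted in the first comment and returns indices into a.

```python
import numpy as np

b = np.array([0, 4])

def outer_workaround(a, b):
    # Compare every chunk maximum against the whole array, take the first match.
    m = np.maximum.reduceat(a, b)
    return np.argmax(np.equal.outer(m, a), axis=1)

a1 = np.array([0, 1, 1, 2, 3, 1, 2])
print(outer_workaround(a1, b))  # [3 4]  correct: the maxima 2 and 3 are distinct

a2 = np.array([0, 1, 1, 3, 3, 1, 2])
print(outer_workaround(a2, b))  # [3 3]  wrong: the second chunk's max is at index 4

# np.maximum is a ufunc and therefore has reduceat; np.argmax is a plain function.
print(isinstance(np.maximum, np.ufunc))  # True
print(isinstance(np.argmax, np.ufunc))   # False
print(hasattr(np.argmax, "reduceat"))    # False
```

The failure on a2 comes from matching each maximum against the whole array instead of against its own chunk only.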

Tags: numpy vectorization reduction argmax numpy-ufunc

【Solution 1】:

Borrowing the idea from this post.

Steps involved:

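Offset all elements of each group by a limit-offset, i.e. a multiple of a number larger than any value in the array. Sort the result globally: the offsets keep every group in its own position range, while the elements inside each group get sorted.
In the sorted array, look at the last element of each group, which is the group maximum. Its original index, after subtracting the group's start, is the argmax within the group.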
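To make the steps concrete, here is a hedged trace on the sample arrays from the question. The variable names group_id, shifted, order and last_of_group are chosen for illustration only, and the sketch assumes non-negative values in a.

```python
import numpy as np

a = np.array([0, 1, 1, 2, 3, 1, 2])
b = np.array([0, 4])

n = a.max() + 1  # limit-offset, here 4
# Group number of every element: 0 for the first chunk, 1 for the second.
group_id = np.repeat(np.arange(b.size), np.diff(np.append(b, a.size)))
shifted = a + n * group_id   # values of group k now lie in [k*n, (k+1)*n)
order = shifted.argsort()    # global sort keeps the groups apart
last_of_group = np.append(b[1:], a.size) - 1  # last sorted slot of each group

print(group_id)                  # [0 0 0 0 1 1 1]
print(shifted)                   # [0 1 1 2 7 5 6]
print(order[last_of_group])      # [3 4]  indices into a
print(order[last_of_group] - b)  # [3 0]  indices inside each chunk
```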

Hence, a vectorized implementation would be:

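import numpy as np

def numpy_argmax_reduceat(a, b):
    # Assumes non-negative values in a and strictly increasing b.
    n = a.max()+1  # limit-offset: larger than any value in a
    grp_count = np.append(b[1:] - b[:-1], a.size - b[-1])  # group lengths
    shift = n*np.repeat(np.arange(grp_count.size), grp_count)  # per-element offset
    sortidx = (a+shift).argsort()  # groups stay in place, sorted internally
    grp_shifted_argmax = np.append(b[1:],a.size)-1  # last sorted slot of each group
    return sortidx[grp_shifted_argmax] - b  # argmax relative to each group start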

As a minor tweak that is possibly faster, we could alternatively create shift with cumsum, giving a variant of the earlier approach, like so:

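def numpy_argmax_reduceat_v2(a, b):
    # Assumes non-negative values in a and strictly increasing b.
    n = a.max()+1  # limit-offset: larger than any value in a
    id_arr = np.zeros(a.size,dtype=int)
    id_arr[b[1:]] = 1  # mark the start of every group except the first
    shift = n*id_arr.cumsum()  # cumsum turns the marks into group numbers
    sortidx = (a+shift).argsort()
    grp_shifted_argmax = np.append(b[1:],a.size)-1  # last sorted slot of each group
    return sortidx[grp_shifted_argmax] - b  # argmax relative to each group start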
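Both functions rely on the offset n = a.max()+1 separating the groups, which only holds for non-negative values, and with a plain argsort the index returned for tied maxima is not guaranteed to be the first one. A possible variant, offered as a sketch rather than as the answer's method, swaps the offset trick for np.lexsort with the group number as primary key. The name argmax_reduceat_first is hypothetical, and b is still assumed to be strictly increasing.

```python
import numpy as np

def argmax_reduceat_first(a, b):
    """Sketch: first-occurrence argmax per chunk, relative to the chunk start."""
    a = np.asarray(a)
    b = np.asarray(b)
    ends = np.append(b[1:], a.size)  # exclusive end of each chunk
    group_id = np.repeat(np.arange(b.size), ends - b)
    # lexsort uses the LAST key as primary: sort by group, then by value,
    # then by descending index so the earliest tied maximum ends up last.
    order = np.lexsort((-np.arange(a.size), a, group_id))
    return order[ends - 1] - b

a = np.array([0, 1, 1, 3, 3, 1, 2])
b = np.array([0, 4])
print(argmax_reduceat_first(a, b))  # [3 0]

neg = np.array([5, 5, -1, -1, -3])
print(argmax_reduceat_first(neg, np.array([0, 2])))  # [0 0]
```

Because no offset is added to the data, this sketch also works with floats and cannot overflow from the shift. Sorting on three keys may be slower than the single argsort above.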

【Discussion】:

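Both solutions work nicely in my case, since I already have shift from an earlier operation. Great answer.
Hey, you answered a question of mine a few months ago; could you take a look at this post: stackoverflow.com/questions/67680199/…. It is related to this question.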