Is it possible to find similarities between rows in a matrix without a loop?

【Question title】: Is it possible to find similarities between rows in a matrix without loop?
【Posted】: 2021-06-08 16:44:21
【Question description】:

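I have a 2D numpy array. I am trying to compute the similarity between its rows and store the result in a similarities array. Is this possible without a loop? Thanks for your time!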

# ratings.shape = (943, 1682)

arri = np.zeros(943)
arri = np.where(arri == 0)[0]

arrj = np.zeros(943)
arrj = np.where(arrj ==0)[0]

similarities = np.zeros((ratings.shape[0], ratings.shape[0]))

similarities[arri, arrj] = np.abs(ratings[arri]-ratings[arrj])

I want to build a 2D array similarities such that similarities[i, j] is the difference between row i and row j of ratings.

[ValueError: shape mismatch: value array of shape (943,1682) could not be broadcast to indexing result of shape (943,)][1]

[1]: https://i.stack.imgur.com/gtst9.png

【Question comments】:

I want to build a 2D array similarities such that similarities[i, j] is the difference between row i and row j of ratings.

Tags: python arrays numpy broadcasting

【Solution 1】:

The problem lies in how numpy walks through the index arrays when you index a 2D array with two arrays.

First, some setup:

import numpy

ratings = numpy.arange(1, 6)

indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]

ratings:[1 2 3 4 5]

indicesX:[[0][1][2][3][4]]

indicesY:[[0][1][2][3][4]]

Now let's see what your approach produces:

similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[0])

similarities:

[[0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 2. 0. 0.]
 [0. 0. 0. 3. 0.]
 [0. 0. 0. 0. 4.]]

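As you can see, numpy pairs the entries of indicesX and indicesY element by element, so it effectively iterates over similarities like this: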

for i in range(5):
    similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i]-ratings[0])

similarities:

[[0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 2. 0. 0.]
 [0. 0. 0. 3. 0.]
 [0. 0. 0. 0. 4.]]
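
The same pairing rule also explains the ValueError in the question. The following is a minimal sketch, assuming a small made-up stand-in for ratings (the shape (4, 6) and the name idx are illustrative only, not from the question): the left-hand side selects one element per index pair, while the right-hand side still holds a full row per index.

```python
import numpy

# Hypothetical small stand-in for the (943, 1682) ratings matrix.
ratings = numpy.arange(24).reshape(4, 6)

# Equivalent to the question's np.where(np.zeros(n) == 0)[0]: [0, 1, 2, 3].
idx = numpy.arange(ratings.shape[0])

similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))

# Two 1-D index arrays are paired element by element, so only the
# entries (0, 0), (1, 1), ... are selected.
target_shape = similarities[idx, idx].shape

# The right-hand side keeps one whole row per index.
value_shape = numpy.abs(ratings[idx] - ratings[idx]).shape

print(target_shape)  # (4,)
print(value_shape)   # (4, 6)
```

Assigning a (4, 6) value to a (4,) selection cannot be broadcast, which is the shape mismatch reported in the question (there with 943 and 1682).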

Now we need indices like the following to cover the whole array:

indicesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indicesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]

We build them like this:

# Reshape indicesX from (x,1) to (x,). That's important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])

indicesY = numpy.repeat(indicesY, ratings.shape[0])

indicesX:[0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]

indicesY:[0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]

Perfect! Now just call similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY]) again, and we get:

similarities:

[[0. 1. 2. 3. 4.]
 [1. 0. 1. 2. 3.]
 [2. 1. 0. 1. 2.]
 [3. 2. 1. 0. 1.]
 [4. 3. 2. 1. 0.]]
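
Since the question is tagged broadcasting, here is an alternative sketch that skips the index arrays entirely. It is not the method used in this answer; it relies on numpy broadcasting a column view against a row view of the same 1-D ratings array.

```python
import numpy

ratings = numpy.arange(1, 6)

# ratings[:, None] has shape (5, 1) and ratings[None, :] has shape (1, 5);
# broadcasting expands both to (5, 5) before the subtraction.
similarities = numpy.abs(ratings[:, None] - ratings[None, :])

print(similarities)
```

Note that the result keeps the integer dtype of ratings, whereas the numpy.zeros array above is float.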

Here is the whole code again:

import numpy

ratings = numpy.arange(1, 6)

indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]

similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))

indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])

indicesY = numpy.repeat(indicesY, ratings.shape[0])

similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
print(similarities)
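
The code above uses a 1-D ratings array. In the question, ratings has shape (943, 1682), so each row is a vector and the per-pair difference has to be reduced to a single number. The question does not say which reduction is wanted; the sketch below assumes the sum of absolute differences, and uses a small random stand-in for the real data.

```python
import numpy

# Hypothetical small stand-in for the (943, 1682) ratings matrix.
rng = numpy.random.default_rng(0)
ratings = rng.integers(0, 6, size=(4, 7))

# (4, 1, 7) - (1, 4, 7) broadcasts to (4, 4, 7): every row against every row.
diffs = numpy.abs(ratings[:, None, :] - ratings[None, :, :])

# Assumption: collapse each pair's element-wise differences by summing them.
similarities = diffs.sum(axis=-1)

print(similarities.shape)  # (4, 4)
```

The intermediate diffs array holds n * n * m elements. For the shape in the question that is roughly 1.5 billion values, so on the real data it may be necessary to process the rows in chunks.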

PS

You commented on your own post to improve it. When you want to improve a question, you should edit the question rather than comment on it.

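【Comments】:

Thanks for your help. Your explanation is good and easy to understand. I got it. You saved me :D. Thanks everyone!
I'm glad I could help :). Maybe you could accept my answer (it would be my first accepted answer)!
I accepted it. This was my first question, too :D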