【Question title】: How to iterate over a row in a SciPy sparse matrix?
【Posted】: 2017-01-21 00:20:19
【Question】:

I have a random sparse matrix created as follows:

import numpy as np
from scipy.sparse import rand
foo = rand(100, 100, density=0.1, format='csr')

I want to iterate over the cells in specific rows and perform two computations:

row1 = foo.getrow(bar1)
row2 = foo.getrow(bar2)

"""
Like the following:
sum1 = 0
sum2 = 0
for each cell x in row1:
    sum1 += x
    if the corresponding cell (in the same column) in row2 y is non-zero:
        sum2 += x*y
"""

【Comments】:

  • Are you looking for a solution, or an efficient solution?
  • An efficient one; for example, I'd rather not use "todense".

Tags: python numpy scipy


【Solution 1】:

Here is one approach:

# Get first row summation by simply using sum method of sparse matrix
sum1 = row1.sum()

# Get the non-zero indices of first row
idx1 = row1.indices
data1 = row1.data  # Or get sum1 here with : `data1.sum()`.

# Get the non-zero indices of second row and corresponding data
idx2 = row2.indices
data2 = row2.data

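# Mask the non-zeros of each row whose column also appears in the other row,
# then take the dot product of the matching values to get the second sum.
# This relies on both rows storing their column indices in the same (sorted) order.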
sum2 = data1[np.in1d(idx1,idx2)].dot(data2[np.in1d(idx2,idx1)])

Alternatively, as suggested in the comments by @user2357112, we can simply use matrix multiplication to get the second sum:

sum2 = sum((row1*row2.T).data)
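As a sanity check, the two approaches above can be compared against a literal translation of the loop from the question. This is only a sketch: the row numbers `bar1` and `bar2` are arbitrary values chosen for illustration, and the `np.intersect1d` and `multiply` variants are alternatives that are not part of the original answer.

```python
import numpy as np
from scipy.sparse import rand

foo = rand(100, 100, density=0.1, format='csr')
bar1, bar2 = 3, 7  # hypothetical row numbers, chosen only for illustration

row1 = foo.getrow(bar1)
row2 = foo.getrow(bar2)

# Reference: walk the stored entries of row1, as in the question's pseudocode.
lookup = dict(zip(row2.indices, row2.data))  # column -> value for row2
sum1_loop = 0.0
sum2_loop = 0.0
for col, x in zip(row1.indices, row1.data):
    sum1_loop += x
    y = lookup.get(col, 0.0)
    if y != 0:
        sum2_loop += x * y

# Variant: match the shared columns explicitly, so index order does not matter.
_, pos1, pos2 = np.intersect1d(row1.indices, row2.indices, return_indices=True)
sum2_vec = row1.data[pos1].dot(row2.data[pos2])

# Variant: element-wise product of the two rows, then sum.
sum2_mul = row1.multiply(row2).sum()

# The matrix-product form from the answer.
sum2_mat = sum((row1 * row2.T).data)
```

All of these should agree up to floating-point rounding; the explicit loop is the slowest and is only useful here as a baseline.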

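【Comments】:

  • It looks to me like you could use row1 * row2.T to simplify most of the sum2 computation.
  • @user2357112 Good idea! Added to the post. Thanks!
  • I'm getting intermittent errors from (row1*row2.T).data[0]. Does this work when there is no collision, e.g. when the product is 0?
  • @tyebillion Yes, that was indeed a bug. It should be fixed now. Take a look!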
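The no-collision case raised in the comments can be reproduced with a small sketch. The two rows below are hypothetical and built so that their non-zero columns do not overlap; in that situation the product may have no stored entries at all, so indexing `.data[0]` can fail, while summing the stored entries does not.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 1x4 rows whose non-zero columns do not overlap.
row1 = csr_matrix(np.array([[1.0, 0.0, 2.0, 0.0]]))
row2 = csr_matrix(np.array([[0.0, 3.0, 0.0, 4.0]]))

prod = row1 * row2.T   # 1x1 sparse matrix product
stored = prod.data     # stored entries; may be empty when nothing overlaps

# Summing the stored entries is safe even when there are none.
sum2 = stored.sum()

# Reading the single cell also works, because unstored entries read as 0.
sum2_alt = prod[0, 0]
```

Either form avoids depending on how many entries the product happens to store.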