[Question title]: Removing diagonal elements from a sparse matrix in scipy
[Posted]: 2023-03-11 23:32:01
[Question description]:

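I want to remove the diagonal elements from a sparse matrix. Since the matrix is sparse, these elements should no longer be stored once they are removed.

Scipy provides a method for setting the values of the diagonal elements: setdiag

If I try it with a lil_matrix, it works: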

>>> import numpy as np
>>> from scipy.sparse import lil_matrix, csr_matrix
>>> a = np.ones((2,2))
>>> c = lil_matrix(a)
>>> c.setdiag(0)
>>> c
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 2 stored elements in LInked List format>

With a csr_matrix, however, the diagonal elements do not seem to be removed from storage:

>>> b = csr_matrix(a)
>>> b
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>

>>> b.setdiag(0)
>>> b
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>

>>> b.toarray()
array([[ 0.,  1.],
       [ 1.,  0.]])

Going through a dense array, we of course get:

>>> csr_matrix(b.toarray())
<2x2 sparse matrix of type '<type 'numpy.float64'>'
    with 2 stored elements in Compressed Sparse Row format>

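Is this intentional? If so, is it because of the compressed format of csr matrices? Is there any workaround other than going from sparse to dense and back to sparse?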

[Comments]:

    Tags: python scipy sparse-matrix


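    [Solution 1]:

    Simply setting elements to 0 does not change the sparsity structure of a csr matrix: the zeros remain stored explicitly. You have to call eliminate_zeros afterwards (the session below assumes `import numpy as np` and `from scipy import sparse`):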

    In [807]: a=sparse.csr_matrix(np.ones((2,2)))
    In [808]: a
    Out[808]: 
    <2x2 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in Compressed Sparse Row format>
    In [809]: a.setdiag(0)
    In [810]: a
    Out[810]: 
    <2x2 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in Compressed Sparse Row format>
    In [811]: a.eliminate_zeros()
    In [812]: a
    Out[812]: 
    <2x2 sparse matrix of type '<class 'numpy.float64'>'
        with 2 stored elements in Compressed Sparse Row format>
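
The same steps can be written as a small script that inspects the stored entries directly. This is only a sketch of the behaviour described above. The variable names are illustrative, and it relies on `nnz` counting stored entries (explicit zeros included) while `count_nonzero()` counts only entries whose value is nonzero.

```python
import numpy as np
from scipy import sparse

a = sparse.csr_matrix(np.ones((2, 2)))
a.setdiag(0)

# The diagonal slots are still stored; they now hold explicit zeros.
stored_after_setdiag = a.nnz               # stored entries, zeros included
nonzero_after_setdiag = a.count_nonzero()  # entries whose value is nonzero
data_after_setdiag = a.data.copy()         # raw CSR value array

a.eliminate_zeros()  # drops the explicit zeros, in place
stored_after_eliminate = a.nnz

print(stored_after_setdiag, nonzero_after_setdiag, stored_after_eliminate)
```

Note that `eliminate_zeros()` modifies the matrix in place and returns nothing, so it should not be assigned back to `a`.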
    

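    Because changing the sparsity structure of a csr matrix is relatively expensive, the format lets you set values to 0 without changing that structure.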

    In [829]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000)))
         ...: a.setdiag(0)
    100 loops, best of 3: 3.86 ms per loop
    
    In [830]: %%timeit a=sparse.csr_matrix(np.ones((1000,1000)))
         ...: a.setdiag(0)
         ...: a.eliminate_zeros()
    SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
    10 loops, best of 3: 133 ms per loop
    
    In [831]: %%timeit a=sparse.lil_matrix(np.ones((1000,1000)))
         ...: a.setdiag(0)
    100 loops, best of 3: 14.1 ms per loop
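
As one possible alternative that is not taken from the answer above, the diagonal can also be dropped by filtering the COO triplets, so the CSR structure is never edited entry by entry. The helper name `drop_diagonal` is hypothetical (it is not a scipy function), and no timing claim is made for it. It is a sketch that returns a new matrix and leaves the input untouched.

```python
import numpy as np
from scipy import sparse


def drop_diagonal(m):
    """Return a CSR copy of sparse matrix `m` without stored diagonal entries.

    Hypothetical helper for illustration; not part of scipy.
    """
    coo = m.tocoo()
    keep = coo.row != coo.col  # True for every stored off-diagonal entry
    return sparse.coo_matrix(
        (coo.data[keep], (coo.row[keep], coo.col[keep])), shape=coo.shape
    ).tocsr()


b = sparse.csr_matrix(np.ones((3, 3)))
c = drop_diagonal(b)
print(b.nnz, c.nnz)
```

Whether this beats `setdiag(0)` followed by `eliminate_zeros()` depends on the matrix, so it is worth timing both on your own data.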
    

    [Comments]:

    • Exactly what I was missing. Thanks!