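[Posted]: 2017-12-10 11:37:04
[Question]:
I'm writing an algorithm in which I need to "collapse" or "reduce" a matrix according to the cluster assignments of its nodes. However, the current implementation is the bottleneck of my whole algorithm (measured with the Visual Studio Python profiler).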
import numpy as np

def reduce_matrix(mat: np.matrix, cluster_ids: np.ndarray) -> np.matrix:
    """Reduce node adjacency matrix.

    Arguments:
        mat: Adjacency matrix
        cluster_ids: Cluster membership assignment per current node (integers)

    Returns:
        Reduced adjacency matrix
    """
    # Group node indices by cluster: sort by label, then slice per cluster.
    ordered_nodes = np.argsort(cluster_ids)
    counts = np.unique(cluster_ids, return_counts=True)[1]
    ends = np.cumsum(counts)
    starts = np.concatenate([[0], ends[:-1]])
    clusters = [ordered_nodes[start:end] for start, end in zip(starts, ends)]

    n_c = len(counts)
    # np.asmatrix is the same function as np.mat, which NumPy 2.0 removed.
    reduced = np.asmatrix(np.zeros((n_c, n_c), dtype=int))
    # Only cluster pairs with a != b are filled; the diagonal stays zero.
    for a in range(n_c):
        a_nodes = clusters[a]
        for b in range(a + 1, n_c):
            b_nodes = clusters[b]
            reduced[a, b] = np.sum(mat[a_nodes, :][:, b_nodes])
            reduced[b, a] = np.sum(mat[b_nodes, :][:, a_nodes])
    return reduced
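What is the fastest way to sum over arbitrary rows and columns of a matrix?
I believe the double indexing [a_nodes, :][:, b_nodes] creates a copy of the matrix rather than a view, but I'm not sure whether there is a faster workaround...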
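For reference, indexing with integer arrays does return a copy in NumPy, so each pass through the inner loop allocates two temporaries. One possible way around the per-pair indexing is sketched below: permute the matrix once so that each cluster occupies a contiguous block, then sum the blocks with np.add.reduceat along both axes. The function name reduce_matrix_blocks is hypothetical, and the sketch assumes a dense, square, numeric matrix and returns a plain ndarray rather than an np.matrix. Whether it is actually faster depends on the matrix size and the number of clusters, so it should be profiled against the loop.

```python
import numpy as np


def reduce_matrix_blocks(mat, cluster_ids):
    """Sum adjacency entries between clusters block-wise (illustrative sketch)."""
    mat = np.asarray(mat)  # work on a plain ndarray, not np.matrix
    cluster_ids = np.ravel(cluster_ids)

    # Sort nodes so that members of the same cluster are adjacent.
    order = np.argsort(cluster_ids, kind="stable")
    counts = np.unique(cluster_ids, return_counts=True)[1]
    # First sorted position of each cluster (strictly increasing).
    starts = np.concatenate([[0], np.cumsum(counts)[:-1]])

    # A single fancy-indexing copy replaces the per-pair copies of the loop.
    sorted_mat = mat[np.ix_(order, order)]

    # Collapse the row blocks, then the column blocks.
    row_sums = np.add.reduceat(sorted_mat, starts, axis=0)
    reduced = np.add.reduceat(row_sums, starts, axis=1)

    # The loop version never fills within-cluster sums, so zero the diagonal.
    np.fill_diagonal(reduced, 0)
    return reduced
```

If the within-cluster sums are also wanted, the np.fill_diagonal call can simply be dropped.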
[Discussion]:
Tags: python performance numpy matrix sum