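【Posted】: 2020-08-21 04:48:48
【Question】:
I am doing graph clustering in Python. The algorithm requires the data from graph G to be passed in as an adjacency matrix. However, when I try to get the adjacency matrix as a NumPy array like this: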
import networkx as nx
matrix = nx.to_numpy_matrix(G)
I get a memory error. The message is: MemoryError: Unable to allocate 2.70 TiB for an array with shape (609627, 609627) and data type float64
However, my machine is new (Lenovo E490), running 64-bit Windows with 8 GB of RAM.
Other information that may be relevant:
Number of nodes: 609627
Number of edges: 915549
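For reference, the 2.70 TiB figure in the error follows directly from the node count: a dense float64 matrix needs 8 bytes for every one of the 609627 × 609627 cells, regardless of how few edges exist. The sketch below reproduces that arithmetic and compares it with a rough estimate for a CSR sparse layout. The helper names `dense_bytes` and `csr_bytes` are made up for illustration, and the CSR estimate assumes 4-byte indices, an undirected graph with each edge stored twice, and no self-loops.

```python
def dense_bytes(n_nodes, itemsize=8):
    """Bytes for a dense n x n matrix (float64 uses 8 bytes per cell)."""
    return n_nodes * n_nodes * itemsize


def csr_bytes(n_nodes, n_edges, itemsize=8, index_size=4):
    """Rough bytes for a CSR matrix of an undirected graph.

    Assumptions (not from the original post): each edge is stored twice
    (i->j and j->i), there are no self-loops, and indices are 4 bytes.
    """
    nnz = 2 * n_edges
    data_and_indices = nnz * (itemsize + index_size)
    indptr = (n_nodes + 1) * index_size
    return data_and_indices + indptr


n_nodes, n_edges = 609627, 915549

dense_tib = dense_bytes(n_nodes) / 2**40
sparse_mib = csr_bytes(n_nodes, n_edges) / 2**20

print(f"dense:  {dense_tib:.2f} TiB")
print(f"sparse: {sparse_mib:.1f} MiB")
```

Under these assumptions the dense form matches the 2.70 TiB in the error message, while the sparse form is on the order of tens of MiB, so the input matrix itself is not what exceeds 8 GB when it is kept sparse.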
The full workflow is as follows:
Graphtype = nx.Graph()
G = nx.from_pandas_edgelist(df, 'source','target', edge_attr='weight', create_using=Graphtype)
Markov clustering:
import markov_clustering as mc
import networkx as nx
matrix = nx.to_scipy_sparse_matrix(G) # build the matrix
result = mc.run_mcl(matrix) # run MCL with default parameters
MemoryError
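One possible contributor to a MemoryError even with sparse input (a guess, not a diagnosis of this run): the expansion step of Markov clustering squares the matrix, and the square of a sparse matrix can have far more nonzero entries than the original. The pure-Python sketch below shows this fill-in on a toy star graph. The helpers `star_graph`, `count_nonzeros`, and `square_pattern` are hypothetical names written for this illustration and are not part of `markov_clustering` or `networkx`.

```python
def star_graph(n_leaves):
    """Adjacency as {node: set(neighbors)}; node 0 is the hub."""
    adj = {0: set(range(1, n_leaves + 1))}
    for leaf in range(1, n_leaves + 1):
        adj[leaf] = {0}
    return adj


def count_nonzeros(adj):
    """Number of nonzero entries in the adjacency pattern."""
    return sum(len(nbrs) for nbrs in adj.values())


def square_pattern(adj):
    """Nonzero pattern of A @ A: (i, k) is nonzero if i->j and j->k for some j."""
    return {i: set().union(*(adj[j] for j in nbrs)) for i, nbrs in adj.items()}


adj = star_graph(1000)
squared = square_pattern(adj)

nnz_before = count_nonzeros(adj)     # 2 entries per hub-leaf edge
nnz_after = count_nonzeros(squared)  # every leaf now reaches every leaf

print(nnz_before, nnz_after)
```

Here 1000 edges turn into roughly a million nonzeros after one squaring, because every pair of leaves is connected through the hub. Whether the real graph behaves like this depends on its degree distribution; graphs with a few very high-degree nodes are the ones most exposed to this effect.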
【Discussion】:
Tags: python pandas numpy cluster-analysis networkx