【Question title】: RuntimeError filtering the edges with weight below the threshold - Networkx
【Posted】: 2020-03-26 00:56:20
【Question】:

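I'm using Python and networkx, and this is my first project with this tool. I want to build a graph to analyze the similarity between some strings. FYI, I use cosine similarity to compute the similarity between the strings.

Here is the code I have so far: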

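import string

import networkx as nx
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# `data` and `stop_words` are defined elsewhere (not shown in the question)
skills = list(data['skills'])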


def clean_string(text):
    text = ''.join([word for word in text if word not in string.punctuation])
    text = text.lower()
    text = ' '.join([word for word in text.split() if word not in stop_words])
    return text

cleaned = list(map(clean_string, skills))
# print(cleaned)

vectorizer = CountVectorizer().fit_transform(cleaned)
vectors = vectorizer.toarray()
# print(vectors)

csim = cosine_similarity(vectors)

I want the cosine similarities to be the edge weights in my network.

G = nx.from_numpy_matrix(np.matrix(csim), create_using=nx.DiGraph)

Then I tried to filter the graph so that only the edges with weight above the threshold of 0.2 remain.

def slice_network(G, T, data = True):
    """ Remove all edges with weight<T from G or its copy. """
    F = G.copy() if copy else G
    F.remove_edges_from((n1, n2) for n1, n2, w in F.edges(data="weight") if w < T)
    return G

F = slice_network(G, 0.2)
print(F.edges())

However, it gives me this error:

RuntimeError: dictionary changed size during iteration

Can anyone help me?

【Question comments】:

    Tags: python networking networkx


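    【Solution 1】:

    You just need to add [] to your remove_edges_from call (and you should return F instead of G). With the brackets, the full list of edges to remove is built first, so the graph is no longer modified while F.edges is still being iterated. Based on your other question, I created a minimal reproducible example: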

    import networkx as nx
    import numpy as np
    import matplotlib.pyplot as plt
    
    
    simple_weights = [[1., 0.51639778, 0., 0., 0., 0.],
                      [0.51639778, 1., 0., 0., 0., 0.25819889],
                      [0., 0., 1., 0., 0., 0.33333333],
                      [0., 0., 0., 1., 0.65465367, 0.],
                      [0., 0., 0., 0.65465367, 1., 0.],
                      [0., 0.25819889, 0.33333333, 0., 0., 1.]]
    
    
    G = nx.from_numpy_matrix(np.array(simple_weights), create_using=nx.DiGraph)
    nx.draw(G)
    plt.show()
    
    F = G.copy()
    threshold = 0.4
    F.remove_edges_from([(n1, n2) for n1, n2, w in F.edges(data="weight") if w < threshold])
    nx.draw(F)
    plt.show()
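
To make the difference between the two calls concrete, here is a minimal sketch on a hypothetical three-edge graph (the graph and the names `lazy`, `eager` and `to_remove` are illustrative assumptions, not the asker's data). The generator version is wrapped in try/except because the error reported in the question is what it is expected to raise; the list version collects the edges first and then removes them.

```python
import networkx as nx

# Hypothetical toy graph (not the asker's data).
G = nx.DiGraph()
G.add_weighted_edges_from([(0, 1, 0.5), (1, 2, 0.1), (2, 0, 0.3)])
threshold = 0.2

# Generator version: lazy.edges(...) is still being iterated while
# remove_edges_from deletes entries from the same adjacency dicts.
lazy = G.copy()
try:
    lazy.remove_edges_from(
        (u, v) for u, v, w in lazy.edges(data="weight") if w < threshold
    )
    error = None
except RuntimeError as exc:
    error = str(exc)

# List version: the edges to drop are collected first, then removed.
eager = G.copy()
to_remove = [(u, v) for u, v, w in eager.edges(data="weight") if w < threshold]
eager.remove_edges_from(to_remove)

print(error)
print(sorted(eager.edges(data="weight")))
```

Only the list version is guaranteed to leave the copy in a well-defined state; after a failed generator call, `lazy` may already have lost some edges.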
    
    

    Or as your function (note that copy is not defined in your code above):

    def slice_network(G, T, copy=True):
        """ Remove all edges with weight<T from G or its copy. """
        F = G.copy() if copy else G
        F.remove_edges_from([(n1, n2) for n1, n2, w in F.edges(data="weight") if w < T])
        return F
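
If mutating a copy is not required, a different technique is a filtered read-only view. The sketch below uses `nx.subgraph_view` with a `filter_edge` callback on a hypothetical toy graph; the graph and the names `T`, `view` and `F` are assumptions for illustration. This is an alternative to the function above, not a restatement of it.

```python
import networkx as nx

# Hypothetical toy graph (not the asker's data).
G = nx.DiGraph()
G.add_weighted_edges_from([(0, 1, 0.5), (1, 2, 0.1), (2, 0, 0.3)])
T = 0.2

# Read-only view that hides edges with weight < T; G itself is untouched.
view = nx.subgraph_view(G, filter_edge=lambda u, v: G[u][v]["weight"] >= T)

# Materialize an independent, mutable graph only if one is needed.
F = view.copy()

print(sorted(view.edges(data="weight")))
```

Because nothing is removed from `G`, there is no iteration-while-modifying problem, and all nodes are kept even if they lose every edge.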
    

    Or filter the matrix before creating the graph:

    threshold = 0.4
    simple_weights = np.array(simple_weights)
    simple_weights[simple_weights<threshold] = 0
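
As a sketch of how the thresholded matrix turns into a graph, the example below uses a small hypothetical similarity matrix (not the one from the answer) and `nx.from_numpy_array`, since `from_numpy_matrix` was removed in NetworkX 3.0. `np.where` is used here only so the original array stays unchanged; that is a choice of this sketch, not part of the answer.

```python
import networkx as nx
import numpy as np

# Small hypothetical similarity matrix (symmetric, ones on the diagonal).
weights = np.array([
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.3],
    [0.1, 0.3, 1.0],
])
threshold = 0.4

# Zero out entries below the threshold without touching the original array.
filtered = np.where(weights < threshold, 0.0, weights)

# Zero entries produce no edge, so only the surviving weights become edges.
F = nx.from_numpy_array(filtered, create_using=nx.DiGraph)

print(sorted(F.edges(data="weight")))
```

Note that the ones on the diagonal survive the threshold and therefore show up as self-loops in the resulting graph.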
    

    【Comments】:
