【Question title】: Updating weight information depending on repeat of edges with networkx
【Posted】: 2012-04-03 14:38:03
【Question description】:

I have a JSON data feed that contains many user relationships, for example:

"subject_id = 1, object_id = 2, object = added 
subject_id = 1, object_id = 2, object = liked
subject_id = 1, object_id = 3, object = added
subject_id = 2, object_id = 1, object = added"

Right now I use the following code to convert the JSON into a networkx graph:

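import networkx as nx
import simplejson

def load(fname):
    G = nx.DiGraph()
    with open(fname) as f:
        d = simplejson.load(f)
    for item in d:
        # each item maps an attribute name to a relation record
        for attribute, value in item.items():
            G.add_edge(value['subject_id'], value['object_id'])
    return G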

The result looks like this:

[('12820', '80842'), ('12820', '81312'), ('12820', '81311'), ('13317', '29'), ('12144', '81169'), ('13140', '16687'), ('13140', '79092'), ('13140', '78384'), ('13140', '48715'), ('13140', '54151'), ('13140', '13718'), ('13140', '4060'), ('13140', '9914'), ('13140', '32877'), ('13140', '9918'), ('13140', '4740'), ('13140', '47847'), ('13140', '29632'), ('13140', '72395'), ('13140', '48658'), ('13140', '78394'), ('13140', '4324'), ('13140', '4776'), ('13140', '78209'), ('13140', '51624'), ('13140', '66274'), ('13140', '38009'), ('13140', '80606'), ('13140', '13762'), ('13140', '28402'), ('13140', '13720'), ('13140', '9922'), ('13303', '81199'), ('13130', '70835'), ('13130', '7936'), ('13130', '30839'), ('13130', '11558'), ('13130', '32157'), ('13130', '2785'), ('13130', '9914'), ('13130', '73597'), ('13130', '9918'), ('13130', '49879'), ('13130', '62303'), ('13130', '64275'), ('13130', '48123'), ('13130', '8722'), ('13130', '43303'), ('13130', '39316'), ('13130', '78368'), ('13130', '28328'), ('13130', '57386'), ('13130', '30739'), ('13130', '9922'), ('13130', '71464'), ('13130', '50296'), ('12032', '65338'), ('13351', '81316'), ('13351', '16926'), ('13351', '80435'), ('13351', '79086'), ('12107', '16811'), ('12107', '70310'), ('12107', '10008'), ('12107', '25466'), ('12107', '36625'), ('12107', '81320'), ('12107', '48912'), ('12107', '62209'), ('12816', '79526'), ('12816', '79189'), ('13180', '39769'), ('13180', '81319'), ('12293', '70918'), ('12293', '59403'), ('12293', '76348'), ('12293', '12253'), ('12293', '65078'), ('12293', '61126'), ('12293', '12243'), ('12293', '12676'), ('12293', '11693'), ('12293', '78387'), ('12293', '54788'), ('12293', '26113'), ('12293', '50472'), ('12293', '50365'), ('12293', '66431'), ('12293', '29781'), ('12293', '50435'), ('12293', '48145'), ('12293', '79170'), ('12293', '76730'), ('12293', '13260'), ('12673', '29'), ('12672', '29'), ('13559', '9327'), ('12583', '25462'), ('12252', '50754'), ('12252', '11778'), ('12252', '38306'), 
('12252', '48170'), ('12252', '5488'), ('12325', '78635'), ('12325', '4459'), ('12325', '68699'), ('12559', '80285'), ('12559', '78273'), ('12020', '48291'), ('12020', '4498'), ('12746', '48916'), ('13463', '56785'), ('13463', '47821'), ('13461', '80790'), ('13461', '4425'), ('12550', '48353')]

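What I want to do is increase the weight when there is more than one relationship between two users. As shown in the JSON relationships above, subject_id 1 has 3 relationships with subject_id 2, so their weight should be 3, while user 3 has only 1 relationship with subject_id 1, so that weight should be 1.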

Update:

I think I have solved my problem:

def load(fname):
    G = nx.MultiDiGraph()
    with open(fname) as f:
        d = simplejson.load(f)
    for item in d:
        for attribute, value in item.items():
            if (value['subject_id'], value['object_id']) in G.edges():
                data = G.get_edge_data(value['subject_id'], value['object_id'], key='edge')
                G.add_edge(value['subject_id'], value['object_id'], key='edge', weight=data['weight']+1)
            else:
                G.add_edge(value['subject_id'], value['object_id'], key='edge', weight=1)

    print(G.edges(data=True))

But your help would still be welcome to improve it.

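【Comments on the question】:

  • Well, your "solution" may not be quite a solution. MultiDiGraph allows multiple edges between two nodes. You are not modifying the weight of the previous edge; you are adding a new edge each time you encounter a pair. One thing you might clarify: are you looking for a directed graph? Your explanation describes an undirected graph, since you give the (1,2) relationship a weight of 3 from two 1->2 relationships and one 2->1 relationship.
  • You are right, I completely forgot about the direction between nodes. Could you suggest a solution that handles both the weight update and the directed graph?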
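To make the directed-versus-undirected point from the comments concrete, here is a minimal sketch that counts the four sample relations from the question both ways. It uses only `collections.Counter`; the `pairs` list is a hand-written stand-in for the JSON feed, not the asker's actual data.

```python
from collections import Counter

# The four sample relations from the question, as (subject_id, object_id) pairs.
# This list is an illustrative stand-in for the real JSON feed.
pairs = [(1, 2), (1, 2), (1, 3), (2, 1)]

# Directed reading: (1, 2) and (2, 1) are counted as different edges.
directed = Counter(pairs)

# Undirected reading: sort each pair so both directions share one key.
undirected = Counter(tuple(sorted(p)) for p in pairs)

print(directed[(1, 2)], directed[(2, 1)])  # counts per direction
print(undirected[(1, 2)])                  # both directions combined
```

Under the directed reading the 1->2 edge has weight 2 and the 2->1 edge has weight 1; the weight of 3 described in the question only appears under the undirected reading.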

Tags: python duplicates networkx edges


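【Solution 1】:

You can simply store your weights in the weight attribute. You can check whether an edge already exists with the has_edge method. Combining the two gives you: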

import networkx as nx
import simplejson

def load(fname):
    G = nx.DiGraph()
    with open(fname) as f:
        d = simplejson.load(f)
    for item in d:
        for attribute, value in item.items():
            subject_id, object_id = value['subject_id'], value['object_id']
            if G.has_edge(subject_id, object_id):
                # we added this one before, just increase the weight by one
                G[subject_id][object_id]['weight'] += 1
            else:
                # new edge. add with weight=1
                G.add_edge(subject_id, object_id, weight=1)
    return G
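As a quick check of the approach above, the following sketch applies the same has_edge logic to the four sample relations from the question, without reading a file. The `build_graph` helper and the in-memory `pairs` list are hypothetical names introduced here for illustration; they are not part of the original answer.

```python
import networkx as nx

def build_graph(pairs):
    """Build a DiGraph whose 'weight' counts how often each (u, v) pair occurs.

    Hypothetical helper: same logic as load() above, minus the JSON parsing.
    """
    G = nx.DiGraph()
    for u, v in pairs:
        if G.has_edge(u, v):
            # seen before: bump the existing edge's weight
            G[u][v]['weight'] += 1
        else:
            # first occurrence: create the edge with weight 1
            G.add_edge(u, v, weight=1)
    return G

# Illustrative stand-in for the JSON feed in the question.
pairs = [(1, 2), (1, 2), (1, 3), (2, 1)]
G = build_graph(pairs)

# Collect the weights into a plain dict for easy inspection.
weights = {(u, v): d['weight'] for u, v, d in G.edges(data=True)}
print(weights)
```

Because the graph is directed, 1->2 and 2->1 are tracked separately. If the undirected total from the question is what you want, swapping `nx.DiGraph()` for `nx.Graph()` should merge the two directions into a single edge.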

【Comments】:
