Determining Betweenness Centrality With The Igraph Library: Answers

【Question title】: Determining Betweenness Centrality With The Igraph Library
【Posted】: 2011-12-16 05:01:27
【Question description】:

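I am a very, very mediocre programmer, but I still intend to use the igraph Python library to determine how a user's centrality in a given forum affects his later contributions to that forum, so that those contributions can be predicted.

I got in touch with someone who did something similar using the NetworkX library, but given the current size of the forum, calculating exact centrality indices is virtually impossible: it simply takes too much time.

Here is his code: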

import networkx as netx
import sys, csv

if len(sys.argv) != 2:
   print 'Please specify an input graph.'
   sys.exit(1)

ingraph = sys.argv[1]
graph = netx.readwrite.gpickle.read_gpickle(ingraph)

num_nodes = len(graph.nodes())
print '%s nodes found in input graph.' % num_nodes
print 'Recording data in centrality.csv'

# Calculate all of the betweenness measures
betweenness = netx.algorithms.centrality.betweenness_centrality(graph)
print 'Betweenness computations complete.'
closeness = netx.algorithms.centrality.closeness_centrality(graph)
print 'Closeness computations complete.'

outcsv = csv.writer(open('centrality.csv', 'wb'))

for node in graph.nodes():
   outcsv.writerow([node, betweenness[node], closeness[node]])

print 'Complete!'

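I tried to write something similar with the igraph library (which allows quick estimates instead of exact calculations), but I cannot seem to write the data to a CSV file.

My code: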

import igraph
import sys, csv

from igraph import *

graph = Graph.Read_Pajek("C:\karate.net")

print igraph.summary(graph)

estimate = graph.betweenness(vertices=None, directed=True, cutoff=2)
print 'Betweenness computation complete.'

outcsv = csv.writer(open('estimate.csv', 'wb'))

for v in graph.vs():
   outcsv.writerow([v, estimate[vs]])

print 'Complete!'

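I cannot find in the igraph documentation how to refer to individual vertices (or nodes, in NetworkX jargon), so that is where I get the error message. Maybe I am forgetting something else as well; I am probably too bad a programmer to notice :P

What am I doing wrong?

【Comments on the question】:

I believe that is how individual vertices are referred to in igraph, @danihp!
vs is an attribute of the graph object that behaves like a list of vertices, so using it as an index into estimate is incorrect (and vs on its own is an undefined variable anyway). You should use v.index instead of vs.

Tags: python igraph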

【Solution 1】:

So, for the sake of clarity, the following eventually turned out to do the trick:

import igraph
import sys, csv

from igraph import *
from itertools import izip

graph = Graph.Read_GML("C:\stack.gml")

print igraph.summary(graph)

my_id_to_igraph_id = dict((v, k) for k, v in enumerate(graph.vs["id"]))

estimate = graph.betweenness(directed=True, cutoff=16)
print 'Betweenness computation complete.'

print graph.vertex_attributes()

outcsv = csv.writer(open('estimate17.csv', 'wb'))

outcsv.writerows(izip(graph.vs["id"], estimate))

print 'Complete!'

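【Comments】:

Hi Daniel, I noticed that you chose cutoff=16 in this example. Is there a rule of thumb for choosing the cutoff value? I have 1.5 million vertices and 11 million edges; any advice on which cutoff to choose?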
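
No rule of thumb is given in this thread. As background only, here is a minimal pure-Python sketch of what a path-length cutoff does to betweenness scores, assuming the cutoff means that shortest paths longer than `cutoff` edges are simply ignored (which is how the igraph documentation describes the parameter). The function name `betweenness` and the five-vertex `path_graph` are made up for illustration; this is a Brandes-style toy for unweighted, undirected graphs, not igraph's implementation.

```python
from collections import deque

def betweenness(adj, cutoff=None):
    """Brandes-style betweenness for an unweighted, undirected graph.

    adj maps each vertex to a list of neighbours. Shortest paths longer
    than `cutoff` edges are ignored; cutoff=None gives the exact values.
    """
    scores = {v: 0.0 for v in adj}
    for source in adj:
        dist = {source: 0}
        n_paths = {v: 0 for v in adj}
        n_paths[source] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            if cutoff is not None and dist[v] >= cutoff:
                continue  # do not expand beyond the cutoff distance
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest vertices back to source.
        dependency = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2.0 for v, s in scores.items()}

# Hypothetical toy graph: a path 0 - 1 - 2 - 3 - 4.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

exact = betweenness(path_graph)              # all shortest paths
limited = betweenness(path_graph, cutoff=2)  # only paths of at most 2 edges
```

On this toy graph the middle vertex scores highest in `exact`, while in `limited` the three inner vertices tie, because only two-edge paths contribute. A smaller cutoff is cheaper but flattens the ranking, so one way to judge a candidate cutoff is to compare rankings on a sample of the graph at two or three cutoff values.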

【Solution 2】:

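As you have already noticed, individual vertices in igraph are accessed through the vs attribute of the graph object. vs behaves like a list, so iterating over it yields the vertices of the graph. Each vertex is represented by an instance of the Vertex class, and the index of a vertex is given by its index attribute. (Note that igraph uses contiguous numeric indices for both vertices and edges, which is why you need the index attribute and cannot use your original vertex names directly.)

I presume that what you need is the name of each vertex as originally stored in your input file. Names are stored in either the name or the id vertex attribute (depending on your input format), so what you need is probably something like this: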

for v in graph.vs:
    outcsv.writerow([v["name"], estimate[v.index]])

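Note that vertex attributes are accessed by indexing the vertex object as if it were a dictionary. An alternative is to use the vs object itself as a dictionary; this gives you a list containing the value of the given vertex attribute for every vertex. For example: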

from itertools import izip

for name, est in izip(graph.vs["name"], estimate):
    outcsv.writerow([name, est])

An even faster version, which hands the lazy izip iterator straight to writerows:

outcsv.writerows(izip(graph.vs["name"], estimate))
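
The snippets above are written for Python 2 (`izip`, and a CSV file opened in `'wb'` mode). As a minimal sketch, assuming Python 3, the same pairing can be done with the built-in `zip` and a text-mode file. The `names` and `estimate` lists are made-up stand-ins for `graph.vs["name"]` and the result of `graph.betweenness(...)`, and `write_scores` is a hypothetical helper name.

```python
import csv
import io

def write_scores(names, estimate, out):
    """Write one (name, score) row per vertex to the open text file `out`."""
    writer = csv.writer(out)
    # zip is lazy in Python 3, so it plays the role izip played in Python 2.
    writer.writerows(zip(names, estimate))

# Made-up stand-ins for graph.vs["name"] and graph.betweenness(...).
names = ["alice", "bob", "carol"]
estimate = [0.0, 2.5, 1.0]

# An in-memory buffer keeps the example self-contained; for a real file use
# open("estimate.csv", "w", newline="") instead.
buffer = io.StringIO()
write_scores(names, estimate, buffer)
csv_text = buffer.getvalue()
```

Opening the real file with `newline=""` is what the `csv` module documentation recommends in Python 3; it replaces the `'wb'` mode used in the Python 2 code.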

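【Comments】:

You should decide whether to use import igraph (in which case you have to construct the graph with igraph.Graph.Read_Pajek) or from igraph import * (in which case you only need summary(graph)). Also, you don't need import itertools, just from itertools import izip. Otherwise it looks fine to me.
So, do I understand correctly that the full code is: (...) graph = Graph.Read_Pajek("C:\karate.net") print igraph.summary(graph) estimate = graph.betweenness(vertices=None, directed=True, cutoff=2) print 'Betweenness computation complete.' outcsv = csv.writer(open('estimate.csv', 'wb')) outcsv.writerows(izip(graph.vs["name"], estimate)) print 'Complete!' Sorry, I can't seem to make line breaks in comments. Anyway, @tamas, the code above results in "KeyError: 'Attribute does not exist'" at outcsv.writerows(izip(graph.vs["name"], estimate)). Any ideas?
Try "id" instead of "name", or list the vertex attributes with print graph.vertex_attributes() to find out which attribute contains the vertex names you need.
Using "id" instead of "name" produces the same error; printing graph.vertex_attributes() just returns []. Sorry if I am being rather dense here; I am quite new to this :P
It looks like your graph has no vertex attributes, in which case you can only identify vertices by their numeric index. So just use outcsv.writerows(enumerate(estimate)).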
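
The exchange above amounts to a small decision: label rows with "name" if it exists, otherwise with "id", otherwise with the numeric vertex index. A minimal sketch of that logic in plain Python 3 follows. The dictionaries are hypothetical stand-ins for what `{a: graph.vs[a] for a in graph.vertex_attributes()}` would give on a real graph, and `label_column` is a made-up helper name.

```python
def label_column(vertex_attrs, n_vertices, preferred=("name", "id")):
    """Return per-vertex labels: the first preferred attribute, else 0..n-1."""
    for key in preferred:
        if key in vertex_attrs:
            return list(vertex_attrs[key])
    # No usable attribute: fall back to the numeric vertex index,
    # which is what enumerate(estimate) yields.
    return list(range(n_vertices))

estimate = [0.0, 2.5, 1.0]  # made-up betweenness scores

# Hypothetical attribute tables: one graph without attributes, one with ids.
no_attrs = {}
with_ids = {"id": [10, 20, 30]}

rows_plain = list(zip(label_column(no_attrs, len(estimate)), estimate))
rows_ids = list(zip(label_column(with_ids, len(estimate)), estimate))
```

Either list of rows can be passed to `csv.writer(...).writerows(...)` unchanged; `rows_plain` is the same as `list(enumerate(estimate))`.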