Centrality calculations from multiple TPMs: answers

【Question title】: Centrality calculations from multiple TPMs
【Posted】: 2021-09-10 20:56:32
【Question description】:

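Brains needed. I don't know whether this can be solved with igraph. Basically:

a.) From my data, I want to create a TPM (transition probability matrix) for each id (see the sample code)

b.) I want to create a directed graph for each TPM

c.) Calculate the betweenness of specific nodes (1 and 5 in my example)

d.) Return the betweenness of the desired nodes, by id, in a separate file

How can I do this for a large dataset with more than 1000 TPMs?

A somewhat similar topic

Desired output:

Data structure:

Sample code: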

Transition matrix creation:


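# For each row (id) of `stack`, build a TPM from columns 2 to 6
# and write it to "<row number>.csv".
# Note: `trans.matrix()` is the poster's own helper; its definition is not shown.
lapply(seq_len(nrow(stack)),
       function(i) {
         tmp <- trans.matrix(as.matrix(stack[i, 2:6]))
         write.csv(tmp, file = paste0(i, ".csv"), quote = FALSE)
       })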
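
Since `trans.matrix()` is not defined in the post, here is a minimal sketch of what such a helper might look like. The name `tpm_from_sequence` and its single-vector interface are assumptions made for illustration; the poster's actual function may differ. It counts first-order transitions in one sequence of states and normalizes each row to sum to 1.

```r
# Hypothetical helper (not from the original post): first-order transition
# probability matrix for one observed sequence of states.
tpm_from_sequence <- function(states) {
  from <- states[-length(states)]  # every state except the last
  to   <- states[-1]               # every state except the first
  counts <- table(from, to)        # transition counts, from (rows) to (columns)
  prop.table(counts, margin = 1)   # divide each row by its row total
}

# Row 4 of `stack` (columns a to e) from the sample data
tpm <- tpm_from_sequence(c(2, 13, 1, 2, 1))
print(tpm)
```

For this sequence, state 2 is followed once by 13 and once by 1, so its row splits evenly between those two columns.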

Resulting TPM for each id; each df holds the TPM of one id:

df1<-structure(list(X1 = c(1, 2, 3, 4), `2` = c(1, 0, 0, 0), `3` = c(0, 
1, 0, 0), `4` = c(0, 0, 1, 0), `5` = c(0, 0, 0, 1)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L), spec = structure(list(
    cols = list(X1 = structure(list(), class = c("collector_double", 
    "collector")), `2` = structure(list(), class = c("collector_double", 
    "collector")), `3` = structure(list(), class = c("collector_double", 
    "collector")), `4` = structure(list(), class = c("collector_double", 
    "collector")), `5` = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))

df2<-structure(list(X1 = c(0, 7, 8, 9), `6` = c(0, 1, 0, 0), `7` = c(0, 
0, 1, 0), `8` = c(0, 0, 0, 1), `9` = c(1, 0, 0, 0)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L), spec = structure(list(
    cols = list(X1 = structure(list(), class = c("collector_double", 
    "collector")), `6` = structure(list(), class = c("collector_double", 
    "collector")), `7` = structure(list(), class = c("collector_double", 
    "collector")), `8` = structure(list(), class = c("collector_double", 
    "collector")), `9` = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))

df3<-structure(list(X1 = c(10, 14, 22, 23), `14` = c(0, 0, 0, 1), 
    `22` = c(1, 0, 0, 0), `23` = c(0, 0, 1, 0), `25` = c(0, 1, 
    0, 0)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), spec = structure(list(cols = list(
    X1 = structure(list(), class = c("collector_double", "collector"
    )), `14` = structure(list(), class = c("collector_double", 
    "collector")), `22` = structure(list(), class = c("collector_double", 
    "collector")), `23` = structure(list(), class = c("collector_double", 
    "collector")), `25` = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"))

df4<-structure(list(X1 = c(1, 2, 13), `1` = c(0, 0.5, 1), `2` = c(1, 
0, 0), `13` = c(0, 0.5, 0)), class = c("spec_tbl_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -3L), spec = structure(list(
    cols = list(X1 = structure(list(), class = c("collector_double", 
    "collector")), `1` = structure(list(), class = c("collector_double", 
    "collector")), `2` = structure(list(), class = c("collector_double", 
    "collector")), `13` = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))

df5<-structure(list(X1 = c(1, 2), `1` = c(0, 0.333333333333333), `2` = c(1, 
0.333333333333333), `5` = c(0, 0.333333333333333)), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -2L), spec = structure(list(
    cols = list(X1 = structure(list(), class = c("collector_double", 
    "collector")), `1` = structure(list(), class = c("collector_double", 
    "collector")), `2` = structure(list(), class = c("collector_double", 
    "collector")), `5` = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1L), class = "col_spec"))



Sample data:

    stack<-structure(list(X1 = c(1, 2, 3, 4, 5), a = c(1, 0, 10, 2, 2), 
        b = c(2, 9, 22, 13, 2), c = c(3, 8, 23, 1, 1), d = c(4, 7, 
        14, 2, 2), e = c(5, 6, 25, 1, 5)), class = c("spec_tbl_df", 
    "tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), spec = structure(list(
        cols = list(X1 = structure(list(), class = c("collector_double", 
        "collector")), a = structure(list(), class = c("collector_double", 
        "collector")), b = structure(list(), class = c("collector_double", 
        "collector")), c = structure(list(), class = c("collector_double", 
        "collector")), d = structure(list(), class = c("collector_double", 
        "collector")), e = structure(list(), class = c("collector_double", 
        "collector"))), default = structure(list(), class = c("collector_guess", 
        "collector")), skip = 1L), class = "col_spec"))


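【Question comments】:

@ThomasIsCoding Thanks. `stack` provides the TPMs; based on those I need the graphs and the betweenness.
@ThomasIsCoding `stack` is a data frame whose columns hold the measurements. It is the values themselves that I am after. What matters about `stack` is that each id has several different measurements.
@ThomasIsCoding Thanks. df1, df2, df3, df4 and df5 were created from the `stack` data using the sample code. They represent the TPM for each id. Afterwards, the TPMs were loaded into another piece of software where betweenness is easy to compute.

Tags: r dataframe igraph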

【Solution 1】:

One possible igraph option

library(igraph)

# vertices of interest in all graphs
v <- c("1", "5")
data.frame(
    t(
        list2DF(
            lapply(
                # get all `df`s in the global environment and save in a list
                mget(ls(pattern = "^df\\d+")),
                function(x) {
                    # row-column indices for non-zero values
                    inds <- data.frame(which(as.matrix(x[-1]) != 0, arr.ind = TRUE))
                    # replace values in `inds` by row or col names
                    df <- transform(
                        inds,
                        row = x$X1[row],
                        col = names(x[-1])[col]
                    )
                    # create graph object
                    g <- graph_from_data_frame(df)
                    # if the interested vertex shows up in the graph, then we calculate its betweenness centrality; otherwise, return NA
                    sapply(v, function(z) {
                        if (z %in% names(V(g))) {
                            betweenness(g, z, normalized = TRUE)
                        } else {
                            NA
                        }
                    })
                }
            )
        )
    ),
    check.names = FALSE
)

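which gives (the column headers are positions in `v`, so column 1 is vertex "1" and column 2 is vertex "5")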

      1  2
df1 0.0  0
df2  NA NA
df3  NA NA
df4 0.5 NA
df5 0.0  0
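
For more than 1000 TPMs it may be easier to keep them in a named list than in the global environment. The sketch below wraps the same steps in a function and writes one result file. The names `tpm_betweenness`, `tpms`, `id2`, `id4` and the output file name are hypothetical. The optional `weighted` argument is an assumption as well: it uses 1 / probability as the edge cost, because igraph treats the `weight` attribute as a distance when computing betweenness. Whether that is the right cost for a TPM is a modelling choice.

```r
library(igraph)

# Hypothetical helper (not from the original answer): betweenness of the
# chosen vertices for one TPM whose first column holds the "from" states.
tpm_betweenness <- function(tpm, vertices, weighted = FALSE) {
  m <- as.matrix(tpm[-1])
  idx <- which(m != 0, arr.ind = TRUE)  # row/column positions of transitions
  edges <- data.frame(
    from = as.character(tpm[[1]][idx[, "row"]]),
    to   = colnames(m)[idx[, "col"]]
  )
  # Assumption: likelier transitions count as shorter steps (cost = 1 / p).
  if (weighted) edges$weight <- 1 / m[idx]
  g <- graph_from_data_frame(edges, directed = TRUE)
  vapply(vertices, function(z) {
    if (z %in% V(g)$name) {
      betweenness(g, v = z, normalized = TRUE)[[1]]
    } else {
      NA_real_  # vertex does not occur in this TPM
    }
  }, numeric(1))
}

# Two of the TPMs from the question, stored in a named list
tpms <- list(
  id2 = data.frame(X1 = c(0, 7, 8, 9), `6` = c(0, 1, 0, 0), `7` = c(0, 0, 1, 0),
                   `8` = c(0, 0, 0, 1), `9` = c(1, 0, 0, 0), check.names = FALSE),
  id4 = data.frame(X1 = c(1, 2, 13), `1` = c(0, 0.5, 1), `2` = c(1, 0, 0),
                   `13` = c(0, 0.5, 0), check.names = FALSE)
)

# One row per id, one column per vertex of interest
result <- t(sapply(tpms, tpm_betweenness, vertices = c("1", "5")))
out_file <- file.path(tempdir(), "betweenness_by_id.csv")
write.csv(result, out_file)
```

Because the list is built explicitly, the rows of `result` come out in the order of the list, and the columns are labelled with the vertex names rather than positions.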

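【Comments】:

Thank you. When you have time, could you please add some comments/explanation?
@user11418708 I added comments.
Thank you very much for your time. As far as I can see, igraph does not take decimal values such as 0.5 into account. Can this be adjusted? More specifically, for df4.
Thank you very much. When you have time, could you update the line "betweenness(g, z)" to "betweenness(g, z, normalized = T)"?
Thank you very much for your time. One more question: is it possible to sort the results in numerical order? For example, if I read in more than 11 dfs, the order will be df1 df11 df2 df3... instead of df1 df2 df3... I tried adding the order function, but without much success.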
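
On the ordering question: `ls()` returns names sorted as character strings, which is why df11 lands before df2. A minimal sketch of one way around this, assuming every object name has the form `df<number>`, is to order the names by their numeric suffix before collecting the objects. The vector `nms` below stands in for the result of `ls(pattern = "^df\\d+")`.

```r
# Names as a character sort would return them
nms <- c("df1", "df10", "df11", "df2", "df3")

# Strip the "df" prefix and order by the remaining number
ordered <- nms[order(as.integer(sub("^df", "", nms)))]
print(ordered)
```

Passing `ordered` to `mget()` in place of `ls(pattern = "^df\\d+")` would then keep the result rows in numeric order.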