【问题标题】:Calculating network statistics between attribute classes with igraph in R在 R 中使用 igraph 计算属性类之间的网络统计信息
【发布时间】:2020-06-02 09:51:55
【问题描述】:

我在 R 3.5.2 中使用 igraph 版本 1.2.4.2 来分析网络数据。顶点(节点)具有诸如“Sex”和“Age_class”之类的分类属性,而边缘是无向的和加权的。我导入了邻接矩阵并使用“set_vertex_attr”命令附加了顶点属性我想计算网络指标,例如不仅全局网络的介数和强度,而且还计算属性类之间和属性类内的介数,即加权连接的介数女性与女性或男性与女性之间。

我可以通过删除其他属性类的顶点来计算类内网络统计信息,例如

gMM <- delete.vertices(g, V(g)[Sex != 'M'])    # making a network of only males
betweenness(gMM, direction = F)    # calculating male-male only betweenness

但是,这种方法不适用于类间统计,不知道有没有人知道如何在igraph中计算类间统计,谢谢。

【问题讨论】:

    标签: r igraph social-networking network-analysis


    【解决方案1】:

    我还没有找到令人满意的方法(我记得)在 igraph 中做这种事情,所以我总是最终做类似以下的事情。

    首先,这是一些示例数据...

    library(igraph, warn.conflicts = FALSE); set.seed(831); n_nodes <- 12
    
    g <- random.graph.game(n_nodes, 0.2)
    vertex_attr(g) <- list(name = letters[seq_len(n_nodes)],
                           sex = sample(c("male", "female"), n_nodes, replace = TRUE))
    edge_attr(g) <- list(weight = sample(1:50, size = ecount(g)))
    g
    #> IGRAPH 8ef5eee UNW- 12 10 -- Erdos renyi (gnp) graph
    #> + attr: name (g/c), type (g/c), loops (g/l), p (g/n), name (v/c), sex
    #> | (v/c), weight (e/n)
    #> + edges from 8ef5eee (vertex names):
    #>  [1] b--c f--g c--h f--h a--i b--i f--j e--k i--k c--l
    

    ...这是一个提取仅包含同质或异质边的网络的函数...

    subgraph_edges_homophily <- function(graph, vattr_name, heterophily = FALSE,
                                         drop_isolates = FALSE) {
      stopifnot( # arg checks
        igraph::is.igraph(graph) || is.character(vattr_name) || 
          length(vattr_name) == 1L || !is.na(vattr_name) || 
          vattr %in% igraph::vertex_attr_names(vattr_name)
      )
    
      vattrs <- igraph::vertex_attr(graph, name = vattr_name)
      total_el <- igraph::as_edgelist(graph, names = FALSE)
    
      # rows from total_el where the attribute of the edge source == attribute of edge target
      edges_to_keep <- vattrs[total_el[, 1L]] == vattrs[total_el[, 2L]]
    
      # for heterophilous ties, just negate the "in_group" version
      if (heterophily) edges_to_keep <- !edges_to_keep
    
      igraph::subgraph.edges(graph, 
                             eids = which(edges_to_keep), 
                             delete.vertices = drop_isolates)
    }
    

    subgraph_edges_homophily() 将让您像这样提取您正在寻找的网络......

    # homophily
    subgraph_edges_homophily(g, vattr_name = "sex")
    #> IGRAPH 1bc4a38 UNW- 12 3 -- Erdos renyi (gnp) graph
    #> + attr: name (g/c), type (g/c), loops (g/l), p (g/n), name (v/c), sex
    #> | (v/c), weight (e/n)
    #> + edges from 1bc4a38 (vertex names):
    #> [1] e--k i--k c--l
    
    # heterophily
    subgraph_edges_homophily(g, vattr_name = "sex", heterophily = TRUE)
    #> IGRAPH e79e82d UNW- 12 7 -- Erdos renyi (gnp) graph
    #> + attr: name (g/c), type (g/c), loops (g/l), p (g/n), name (v/c), sex
    #> | (v/c), weight (e/n)
    #> + edges from e79e82d (vertex names):
    #> [1] b--c f--g c--h f--h a--i b--i f--j
    
    # no isolates
    subgraph_edges_homophily(g, vattr_name = "sex", drop_isolates = TRUE)
    #> IGRAPH 8ce3efe UNW- 5 3 -- Erdos renyi (gnp) graph
    #> + attr: name (g/c), type (g/c), loops (g/l), p (g/n), name (v/c), sex
    #> | (v/c), weight (e/n)
    #> + edges from 8ce3efe (vertex names):
    #> [1] e--k i--k c--l
    

    ...然后您可以根据需要在这些网络上运行指标。 这是一个计算类间指标的示例,就像你问的那样......

    g %>% 
      subgraph_edges_homophily(vattr_name = "sex", heterophily = TRUE) %>% 
      betweenness(directed = FALSE)
    #>  a  b  c  d  e  f  g  h  i  j  k  l 
    #>  0 10 12  0  0 11  0 12  6  0  0  0
    

    -

    sessionInfo()
    #> R version 3.6.2 (2019-12-12)
    #> Platform: x86_64-pc-linux-gnu (64-bit)
    #> Running under: Ubuntu 18.04.4 LTS
    #> 
    #> Matrix products: default
    #> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
    #> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
    #> 
    #> locale:
    #>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
    #>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
    #>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
    #>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
    #>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    #> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    #> 
    #> attached base packages:
    #> [1] stats     graphics  grDevices utils     datasets  methods   base     
    #> 
    #> other attached packages:
    #> [1] igraph_1.2.4.2
    #> 
    #> loaded via a namespace (and not attached):
    #>  [1] compiler_3.6.2  magrittr_1.5    tools_3.6.2     htmltools_0.4.0
    #>  [5] yaml_2.2.1      Rcpp_1.0.3      stringi_1.4.6   rmarkdown_2.1.1
    #>  [9] highr_0.8       knitr_1.28      stringr_1.4.0   xfun_0.12      
    #> [13] digest_0.6.24   pkgconfig_2.0.3 rlang_0.4.4     evaluate_0.14
    

    【讨论】:

    • 我认为包中有内置函数可以完成任务。非常感谢@kna​​pply,这很有帮助。
    【解决方案2】:

    我对@knapply 提供的解决方案进行了一些修改,因此该功能将提供 1) 班级内网络(例如男性-男性); 2)类间网络(男性-女性); 3)当属性有超过 2 个类别(例如年龄类别)时的 to-other-classes 网络。 以下是修改后的函数:

    
    ## Function - part1 ##
    
    subclass_edges <- function(graph, vattr_name){
      stopifnot( # arg checks
        igraph::is.igraph(graph) || is.character(vattr_name) || 
          length(vattr_name) == 1L || !is.na(vattr_name) || 
          vattr %in% igraph::vertex_attr_names(vattr_name)
      )
    
      vattrs <- igraph::vertex_attr(graph, name = vattr_name)
      vattrs_class <- unique(vattrs)
      total_el <- igraph::as_edgelist(graph, names = FALSE)
    
      # Attribute class to single attribute class
      list_name <- paste0("to_", vattrs_class)
      map(vattrs_class, function(x){
        map(1:length(vattrs_class), function(y){
          (vattrs[total_el[, 1L]] == x) & (vattrs[total_el[, 2L]] == vattrs_class[y])
        }) -> to_class
        names(to_class) <- list_name
        return(to_class)
      }) -> attr_class
      names(attr_class) <- vattrs_class
    
      if(length(vattrs_class) > 2){
        # Attribute class to all other attribute classes
        map(vattrs_class, function(x){
          (vattrs[total_el[, 1L]] == x) & (vattrs[total_el[, 2L]] != x)
        }) -> to_others
        names(to_others) <- vattrs_class
    
        # Combine
        map(1:length(vattrs_class), function(c){
          fin <- c(attr_class[[c]], to_others[c])
          names(fin) <- c(list_name, "to_others")
          return(fin)
        }) -> combind_edges
        names(combind_edges) <- vattrs_class
    
        edges_to_keep <- combind_edges
      } else {
        edges_to_keep <- attr_class
      }
    
      return(edges_to_keep)
    }
    
    
    ## Function - part2 ##
    
    subclass <- function(graph, vattr_name, drop_isolates = FALSE){
      subclass_edges(graph, vattr_name) -> input
      map(input, function(form){
        map(form, function(to){
          igraph::subgraph.edges(graph, 
                                 eids = which(to), 
                                 delete.vertices = drop_isolates)
        })
      })
    }
    
    

    这是一个从@knapply 的答案修改的示例,带有新属性“age_class”和更多节点(顶点):

    
    ## Example ##
    
    set.seed(100)
    n_nodes <- 20
    g <- random.graph.game(n_nodes, 0.2)
    vertex_attr(g) <- list(name = letters[seq_len(n_nodes)],
                           sex = sample(c("male", "female"), n_nodes, replace = TRUE), 
                           age_class = sample(c("15-20", "21-25", "26-30"), n_nodes, replace = TRUE))
    edge_attr(g) <- list(weight = sample(1:50, size = ecount(g)))
    g
    #> IGRAPH ce7c899 UNW- 20 44 -- Erdos renyi (gnp) graph
    #> + attr: name (g/c), type (g/c), loops (g/l), p (g/n), name (v/c), sex (v/c), age_class (v/c), weight (e/n)
    #> + edges from ce7c899 (vertex names):
    #> [1] b--c a--d b--e c--e b--f a--g e--g g--h f--i g--i a--j e--j a--k b--k h--k b--l h--l k--l c--m f--m l--m i--n m--n b--o g--o
    #> [26] k--o b--p f--p h--p c--q p--q f--r k--r n--r p--r b--s h--s m--s n--s p--s q--s i--t k--t n--t
    
    
    g %>% subclass(vattr_name = "age_class") -> g_a
    
    g_a$`15-20`$`to_26-30` %>% igraph::betweenness(directed = F) 
    # betweenness of indviduals in '15-20' age class with individuals in '26-30' age class
    #> a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t 
    #> 0  9  0  0  0 15 10  0 11  0  9  0  0  0 18  9  0 18  0  0 
    
    g_a$`15-20`$to_others %>% igraph::betweenness(directed = F) 
    # betweenness of indviduals in '15-20' age class with individuals in all age classes except '15-20'
    #> a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t 
    #> 0 45  0  0  0 16 32  0 16  0 21 21  0  0 34 18  0 16 10  0 
    
    
    

    希望这对有类似问题的人有所帮助。

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2016-01-15
      • 2017-02-15
      • 2017-11-06
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多