【Question title】: Convert R data into a .GEXF format
【Posted】: 2017-10-29 21:55:26
【Question description】:

This is my first attempt at building a bipartite graph of co-authored PubMed publications (226 records). Here is a sample of the input file (a single CSV row):

11810598;Chêne G, Angelini E, Cotte L, Lang JM, Morlat P, Rancinan C, May T, Journot V, Raffi F, Jarrousse B, Grappin M, Lepeu G, Molina JM;2002;Mar;Role of long-term nucleoside-analogue therapy in lipodystrophy and metabolic disorders in human immunodeficiency virus-infected patients.

 

    InputFile <- 'JMMolina_PubMed.csv'

    # Read the CSV input file into the initial JMMpubs data frame
    setwd('~/Dropbox/R')
    JMMpubs <- read.csv(file = InputFile, header = FALSE, sep = ";", strip.white = TRUE)
    names(JMMpubs) <- c("ID", "AuthList", "Year", "Month", "Title")

    # Build a new data frame IdAuth with one Id row for each co-author;
    # therefore the first article, which has 13 co-authors, generates 13 rows with the same Id
    Authors <- strsplit(as.character(JMMpubs$AuthList), split = ", ")
    IdAuth <- data.frame(Id = rep(JMMpubs$ID, sapply(Authors, length)), Author = unlist(Authors))

    # Now I would like to export this data to Gephi.
    # The nodes of the graph should be the UNIQUE names in Authors
    UniqueAuthors <- unique(unlist(Authors))

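The edges of the graph should be the rows of IdAuth. I would like to attach the publication year, JMMpubs$Year, to each edge (coloring the most recent edges red and older edges in lighter shades).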
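One way to get from the edge description above to a file Gephi can open is to write the GEXF XML by hand with base R. The following is only a sketch under stated assumptions: the toy `edges` data frame stands in for IdAuth joined with JMMpubs$Year, publications are added as a second node type (since each IdAuth row links an Id to an Author), and the color ramp endpoints, the `xml_escape` helper, and the output file name are all hypothetical choices, not something from the question.

```r
# Toy stand-in for IdAuth with the year attached (made-up values).
# With the real data this could come from something like:
#   merge(IdAuth, JMMpubs[, c("ID", "Year")], by.x = "Id", by.y = "ID")
edges <- data.frame(
  Id     = c("11810598", "11810598", "20000001", "20000001"),
  Author = c("Author A", "Author B", "Author B", "Author C"),
  Year   = c(2002, 2002, 2015, 2015),
  stringsAsFactors = FALSE
)

# Hypothetical helper: escape characters that are special in XML attributes
xml_escape <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  gsub('"', "&quot;", x, fixed = TRUE)
}

# Bipartite node set: one node per publication, one per unique author
pubs    <- unique(edges$Id)
authors <- unique(edges$Author)
nodes <- data.frame(
  id    = c(paste0("p", seq_along(pubs)), paste0("a", seq_along(authors))),
  label = c(pubs, authors),
  stringsAsFactors = FALSE
)
src <- paste0("p", match(edges$Id, pubs))
tgt <- paste0("a", match(edges$Author, authors))

# Scale the year to [0, 1], then map it onto a light-red to red ramp
yr_range <- range(edges$Year)
scaled <- if (diff(yr_range) == 0) {
  rep(1, nrow(edges))
} else {
  (edges$Year - yr_range[1]) / diff(yr_range)
}
rgb_mat <- round(colorRamp(c("#FFD5D5", "#FF0000"))(scaled))

node_xml <- sprintf('      <node id="%s" label="%s"/>',
                    nodes$id, xml_escape(nodes$label))
edge_xml <- sprintf(
  paste0('      <edge id="%d" source="%s" target="%s">',
         '<attvalues><attvalue for="0" value="%d"/></attvalues>',
         '<viz:color r="%d" g="%d" b="%d"/></edge>'),
  seq_len(nrow(edges)) - 1L, src, tgt, as.integer(edges$Year),
  as.integer(rgb_mat[, 1]), as.integer(rgb_mat[, 2]), as.integer(rgb_mat[, 3])
)

gexf <- c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  paste0('<gexf xmlns="http://www.gexf.net/1.2draft" ',
         'xmlns:viz="http://www.gexf.net/1.2draft/viz" version="1.2">'),
  '  <graph mode="static" defaultedgetype="undirected">',
  '    <attributes class="edge"><attribute id="0" title="year" type="integer"/></attributes>',
  '    <nodes>', node_xml, '    </nodes>',
  '    <edges>', edge_xml, '    </edges>',
  '  </graph>',
  '</gexf>'
)

# Written to a temporary directory here; point this wherever is convenient
out_file <- file.path(tempdir(), "coauthors.gexf")
writeLines(enc2utf8(gexf), out_file, useBytes = TRUE)
```

Storing the year as an edge attribute as well as baking it into the color means the edges could also be recolored or filtered by year inside Gephi, if the fixed ramp turns out not to suit the data.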

【Question comments】:

    Tags: r csv


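    【Solution 1】:

    I had a similar problem. My solution is below.

    As far as I can tell, you need to reshape your data. If I understand correctly, you need the authors associated with each ID. The original answer is in this post by user1317221_G: https://stackoverflow.com/a/16177624/8080865

    I would set up the data frame as: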

    df3 <- data.frame(Author = c("fawf", "ewew", "wewe", "wrewe", "zare"),
                      ID = c("11", "11", "11"... etc))
    
    ### tnet solution (works)
    # create a numeric identifier for each author
    dfnames <- data.frame(i = as.numeric(factor(df3$Author)),
                          value = df3$Author)
    
    library(tnet)
    # column 1: author codes (primary mode); column 2: publication codes
    tdf       <- as.tnet(cbind(dfnames$i, as.numeric(factor(df3$ID))), type = "binary two-mode tnet")
    relations <- projecting_tm(tdf, method = "sum")
    
    
    # match original names
    relations[["i"]] <- dfnames[match(relations[["i"]], dfnames[["i"]]), "value"]
    relations[["j"]] <- dfnames[match(relations[["j"]], dfnames[["i"]]), "value"]
    
    # clean up names
    names(relations) <- c("source", "target", "weight")
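For readers who want to see what the projection step amounts to, or who would rather not depend on tnet, the same kind of co-author edge list can be approximated in base R. This is a sketch, not the tnet implementation: it assumes the weight should be the number of publications two authors share, which is my understanding of `method = "sum"` on a binary two-mode network. The toy `df3` values are made up, and each pair is listed once rather than in both directions.

```r
# Toy author/publication table (made-up values for illustration)
df3 <- data.frame(
  Author = c("A", "B", "C", "A", "B", "D"),
  ID     = c("11", "11", "11", "12", "12", "12"),
  stringsAsFactors = FALSE
)

# Incidence matrix: rows are publications, columns are authors
inc <- unclass(table(df3$ID, df3$Author))
author_names <- colnames(inc)

# Entry [a, b] counts the publications that authors a and b share
co <- crossprod(inc)

# Keep each unordered pair once and drop pairs with no shared publication
idx <- which(upper.tri(co) & co > 0, arr.ind = TRUE)
relations <- data.frame(
  source = author_names[idx[, "row"]],
  target = author_names[idx[, "col"]],
  weight = co[idx],
  stringsAsFactors = FALSE
)
print(relations)
```

The cross-product builds a dense author-by-author matrix, so this is convenient for a few hundred records but may need a sparse representation for much larger author sets.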
    

    I hope this helps you work out a solution.

    【Comments】:
