【Question title】: Creating similarity matrix of individual lifespans in R
【Posted】: 2014-08-06 13:10:32
【Question description】:

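I assume this is really simple, but some googling and leafing through R books has not turned up an off-the-shelf function that does it, even though I am sure one exists.

I have a data frame with 3 columns: individual label, birth date, and death date. I want to create a square matrix in which each cell is the number of days that both individuals were alive at the same time.

So from this: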

ID Birth Death
A    1/5    5/5
B    2/5   30/5
C   10/5   31/5

To this:

    A    B    C
A   4    3    0
B   3   28   20
C   0   20   21

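What goes on the diagonal (i.e. the self-overlap) is largely irrelevant, since I can ignore it in later analysis.

Cheers for any help

【Question comments】:

    Tags: r matrix overlap similarity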


    【Solution 1】:

    So a friend came up with this, though it is a bit long:

    # Assumes data.csv has columns ID, born and die, with dates written as dd/mm/yyyy
    data<-read.csv("data.csv",header=TRUE) 
    life_start<-as.Date(data$born,"%d/%m/%Y") # Get born column
    life_finish<-as.Date(data$die,"%d/%m/%Y") # Get death column
    lifespan<-as.numeric(life_finish-life_start) # Calculate lifespan
    data<-cbind(data,lifespan) # Add lifespan to data
    data
    overlap_matrix<-matrix(vector(),length(data$ID),length(data$ID)) # Create empty matrix for overlap values
    rownames(overlap_matrix)<-paste(data$ID) # Name rows with ID
    colnames(overlap_matrix)<-paste(data$ID) # Name columns with ID
    
    for (i in 1:length(data$ID)) { # Loop through every ID and call it i (the focal individual)
    
      born_a<-as.Date(data$born[i],"%d/%m/%Y") # Get birthday for focal individual
      die_a<-as.Date(data$die[i],"%d/%m/%Y") # Get deathday for focal individual
      lifespan_a<-data$lifespan[i] # Get lifespan for focal individual
    
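      start_col<-i # Start the inner loop at the diagonal so only the upper triangle is filled (the lower triangle stays NA); named start_col to avoid masking base::diag

      for (j in start_col:length(data$ID)) { # Loop through every ID from i onwards and call it j (to compare to i)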
    
        born_b<-as.Date(data$born[j],"%d/%m/%Y") # Get birthday for comparison individual
        die_b<-as.Date(data$die[j],"%d/%m/%Y") # Get deathday for comparison individual
        lifespan_b<-data$lifespan[j] # Get lifespan for comparison individual
    
        if ((born_a <= die_b) && (die_a >= born_b)) { # If the focal individual was born before the comparison died, and died after the comparison was born (i.e. if there was overlap)
    
          born_diff<-as.numeric(born_a-born_b) # Calculate difference between birthdays
          die_diff<-as.numeric(die_a-die_b) #  Calculate difference between deathdays
    
          if(born_diff <= 0) { # If the focal individual was born on or before the comparison individual's birthday
            overlap<-lifespan_a+born_diff # Trim the lifespan of the focal individual by the gap between the birthdays (born_diff is zero or negative here)
          } else {
            overlap<-lifespan_a # Start overlap as the lifespan of the focal individual
          }
    
          if(die_diff > 0) { # If focal individual died after the comparison individual
            overlap<-overlap-die_diff # Trim the overlap by the gap between the deathdays
          }
    
          overlap_matrix[i,j]<-overlap # Shunt overlap value to the overlap matrix
    
        } else { # No overlap!
          overlap_matrix[i,j]<-0 # Shunt the zero
        }
    
      }
    
    }
    overlap_matrix
    

    The "sna" package also has a function called "interval.graph" that does this job.
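
    The loop above works out to min(death) - max(birth) for each pair, floored at zero. A minimal vectorised sketch of the same idea, using base R only, might look like the following. The data frame name `lives` and the year 2014 are assumptions for illustration (the question gives only day/month), and the column names `born` and `die` follow the code above rather than the table in the question.

```r
# Example data from the question; the year 2014 is an assumption
lives <- data.frame(
  ID   = c("A", "B", "C"),
  born = as.Date(c("2014-05-01", "2014-05-02", "2014-05-10")),
  die  = as.Date(c("2014-05-05", "2014-05-30", "2014-05-31")),
  stringsAsFactors = FALSE
)

# Convert dates to plain day counts so outer() works on numeric vectors
born <- as.numeric(lives$born)
die  <- as.numeric(lives$die)

# For every pair (i, j): the later of the two births and the earlier of the two deaths
latest_birth   <- outer(born, born, pmax)
earliest_death <- outer(die, die, pmin)

# Days of overlap; a negative difference means the lifespans never overlapped
overlap_matrix <- pmax(earliest_death - latest_birth, 0)
dimnames(overlap_matrix) <- list(lives$ID, lives$ID)
overlap_matrix
```

    On the three individuals from the question this gives the full symmetric matrix shown there (4, 28 and 21 on the diagonal; 3, 0 and 20 off it), whereas the loop fills only the upper triangle. Like the loop, it counts date differences, so two individuals who share only a single day get 0.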
