【Question title】: How do I reshape the following data frame in R?
【Posted】: 2017-01-06 18:10:06
【Question】:

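I have a data set like the one below and am trying to write R code to transform it. It is ego-network data: the first column holds two people (the egos), each of whom lists their contacts in columns A1, A2, and A3. Columns 5 to 10 then hold the ties among the people named in A1, A2, and A3: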

d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane","Sam","Jane","Sally","NA","John","Jim","NA","Jane","Sam","NA","NA","Tom"),2,10))
names(d)<-c("Ego","A1","A2","A3","A1Connection1","A1Connection2","A2Connection1","A2Connection2","A3Connection1","A3Connection2")
d

My challenge is to take columns 2 through 10 and make them look like this:

ReshapedData<-data.frame(matrix(c("John","John","Sam","Sam","Sally","Sally","Jim","Jim","Tom","Tom","Jane","Jane",
            "Sam","Sally","John","NA","Sam","NA","Jane","NA","Jim","Jane","NA","Tom"),12,2))
names(ReshapedData)<-c("Alter", "Alter_Alter")
ReshapedData

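I don't need the egos' names, at least not at this stage. The key is to get everything else first. The best I have come up with so far is to transpose columns 5-10 within each row, use rbind to stack them into one long column, and then cbind that with the list of alters from A1, A2, and A3. There must be a more streamlined way to do this.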
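For concreteness, the rbind/cbind idea described above could be sketched roughly as follows. This is only an illustration, assuming the data frame d defined earlier; alter_cols, conn_cols, pieces, and reshaped are names made up for this sketch, and stringsAsFactors = FALSE is added so that the columns are plain character vectors.

```r
# Same data as above; stringsAsFactors = FALSE is an addition for this sketch
d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane",
                         "Sam","Jane","Sally","NA","John","Jim","NA","Jane",
                         "Sam","NA","NA","Tom"), 2, 10),
                stringsAsFactors = FALSE)
names(d) <- c("Ego","A1","A2","A3","A1Connection1","A1Connection2",
              "A2Connection1","A2Connection2","A3Connection1","A3Connection2")

alter_cols <- c("A1", "A2", "A3")   # hypothetical helper names
conn_cols  <- names(d)[5:10]

# For each ego (row): repeat each alter twice and pair it with that row's
# connection columns, read from left to right
pieces <- lapply(seq_len(nrow(d)), function(i) {
  data.frame(
    Alter       = rep(unlist(d[i, alter_cols], use.names = FALSE), each = 2),
    Alter_Alter = unlist(d[i, conn_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
})

# Stack the per-ego pieces into one long data frame
reshaped <- do.call(rbind, pieces)
reshaped
```

Note that "NA" here is the literal string from the example data, not a missing value.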

Thanks

Bogdan

【Comments on the question】:

Tags: r reshape


【Answer 1】:

Use the melt() function from the reshape package, then match the items that share a common index:

d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane","Sam","Jane","Sally","NA","John","Jim","NA","Jane","Sam","NA","NA","Tom"),2,10))
names(d)<-c("Ego","A1","A2","A3","A1Connection1","A1Connection2","A2Connection1","A2Connection2","A3Connection1","A3Connection2")
d

library(reshape)

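# Stack the alter columns; the key is "<alter column> <row number>", e.g. "A1 1"
a <- melt(d, id.vars = NULL, measure.vars = c("A1", "A2", "A3"))
a$match <- paste(a[, 1], 1:2)
# Stack the connection columns (5 through the last one) and build the same key
b <- melt(d, id.vars = NULL, measure.vars = 5:ncol(d))
b$match <- paste(gsub(pattern = ".*A([0-9]+).*", replacement = "A\\1", x = b[, 1]),
                 1:2)

# Look up the alter for each connection through the shared key
df.final <- data.frame(Alter = a$value[match(b$match, a$match)], Alter_Alter = b$value)

# Reorder so that all rows for the first ego come before those for the second
index <- matrix(1:nrow(df.final), nrow = nrow(df.final) / 2, byrow = TRUE)

df.final <- df.final[as.vector(index), ]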

df.final
   Alter Alter_Alter
1   John         Sam
3   John       Sally
5    Sam        John
7    Sam          NA
9  Sally         Sam
11 Sally          NA
2    Jim        Jane
4    Jim          NA
6    Tom         Jim
8    Tom        Jane
10  Jane          NA
12  Jane         Tom

# Test

ReshapedData<-data.frame(matrix(c("John","John","Sam","Sam","Sally","Sally","Jim","Jim","Tom","Tom","Jane","Jane",
            "Sam","Sally","John","NA","Sam","NA","Jane","NA","Jim","Jane","NA","Tom"),12,2))
names(ReshapedData)<-c("Alter", "Alter_Alter")

df.final==ReshapedData

   Alter Alter_Alter
1   TRUE        TRUE
3   TRUE        TRUE
5   TRUE        TRUE
7   TRUE        TRUE
9   TRUE        TRUE
11  TRUE        TRUE
2   TRUE        TRUE
4   TRUE        TRUE
6   TRUE        TRUE
8   TRUE        TRUE
10  TRUE        TRUE
12  TRUE        TRUE
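
For comparison, here is a minimal base R sketch that avoids the reshape dependency. It is not part of the answer above, only an illustration under the assumption that every alter has exactly two connection columns, laid out in the same order as in d. The names alters, conns, and result are made up for this sketch. The idea is that transposing before flattening reads each row across its columns rather than down a column.

```r
# Same data as in the question; stringsAsFactors = FALSE is an addition here
d <- data.frame(matrix(c("Steph","Ellen","John","Jim","Sam","Tom","Sally","Jane",
                         "Sam","Jane","Sally","NA","John","Jim","NA","Jane",
                         "Sam","NA","NA","Tom"), 2, 10),
                stringsAsFactors = FALSE)
names(d) <- c("Ego","A1","A2","A3","A1Connection1","A1Connection2",
              "A2Connection1","A2Connection2","A3Connection1","A3Connection2")

# t() followed by as.vector() flattens row by row:
# first ego's alters, then second ego's alters
alters <- as.vector(t(as.matrix(d[, c("A1", "A2", "A3")])))

# Same row-wise flattening for the six connection columns
conns <- as.vector(t(as.matrix(d[, 5:10])))

# Each alter owns two consecutive connection slots
result <- data.frame(Alter = rep(alters, each = 2),
                     Alter_Alter = conns,
                     stringsAsFactors = FALSE)
result
```

Because the rows come out in ego order already, no reindexing step is needed afterwards.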

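【Comments】:

  • Thank you very much. Perfect!
  • I see one problem. In the reshaped data frame, John in column 1 should have two connections, Sam and Sally. In df.final it pulls Sam and Jane, so it is going down the column rather than across columns 5 and 6. Can this be fixed easily through indexing?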