[Posted]: 2012-09-07 04:51:30
[Question]:
I have a data frame:
test <- structure(list(
y2002 = c("freshman","freshman","freshman","sophomore","sophomore","senior"),
y2003 = c("freshman","junior","junior","sophomore","sophomore","senior"),
y2004 = c("junior","sophomore","sophomore","senior","senior",NA),
y2005 = c("senior","senior","senior",NA, NA, NA)),
.Names = c("2002","2003","2004","2005"),
row.names = c(c(1:6)),
class = "data.frame")
> test
       2002      2003      2004   2005
1  freshman  freshman    junior senior
2  freshman    junior sophomore senior
3  freshman    junior sophomore senior
4 sophomore sophomore    senior   <NA>
5 sophomore sophomore    senior   <NA>
6    senior    senior      <NA>   <NA>
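I want to reshape the data so that each row keeps only its distinct steps, with consecutive repeats collapsed and the row padded with NA, like this: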
result <- structure(list(
y2002 = c("freshman","freshman","freshman","sophomore","sophomore","senior"),
y2003 = c("junior","junior","junior","senior","senior",NA),
y2004 = c("senior","sophomore","sophomore",NA,NA,NA),
y2005 = c(NA,"senior","senior",NA, NA, NA)),
.Names = c("1","2","3","4"),
row.names = c(c(1:6)),
class = "data.frame")
> result
          1      2         3      4
1  freshman junior    senior   <NA>
2  freshman junior sophomore senior
3  freshman junior sophomore senior
4 sophomore senior      <NA>   <NA>
5 sophomore senior      <NA>   <NA>
6    senior   <NA>      <NA>   <NA>
I know that if I treat each row as a vector, I can do something like this:
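careerrow <- c(1,2,3,3,4)
# pair each element with its successor (iterate over positions, not over the values themselves)
pairz <- lapply(seq_along(careerrow),function(i){c(careerrow[i],careerrow[i+1])})
# keep an element if it differs from its successor; the last element has no successor (NA) and is always kept
uniquepairz <- careerrow[sapply(pairz,function(x){is.na(x[2]) || x[1]!=x[2]})]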
My difficulty is applying that per-row logic to my data table. I think lapply is the way to go, but so far I haven't been able to work it out.
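One possible way to carry the per-row idea over to the whole data frame, offered as a sketch rather than a definitive answer: run base R's `rle()` on each row to collapse consecutive repeats, then pad the row with `NA` back to its original width. The names `collapse_steps` and `steps_df` are hypothetical, and the sketch assumes `NA` only ever appears as trailing padding, as in the `test` data.

```r
# Same values as `test` in the question, rebuilt so the block runs on its own
test <- data.frame(
  "2002" = c("freshman","freshman","freshman","sophomore","sophomore","senior"),
  "2003" = c("freshman","junior","junior","sophomore","sophomore","senior"),
  "2004" = c("junior","sophomore","sophomore","senior","senior",NA),
  "2005" = c("senior","senior","senior",NA,NA,NA),
  check.names = FALSE, stringsAsFactors = FALSE)

# Hypothetical helper: collapse consecutive repeats in one row.
# Assumption: NA is only trailing padding, so it is dropped first.
collapse_steps <- function(x) {
  x <- unname(x)
  steps <- rle(x[!is.na(x)])$values
  # pad with NA so every row keeps the original length
  c(steps, rep(NA_character_, length(x) - length(steps)))
}

# apply() returns one column per input row, so transpose back
steps_df <- as.data.frame(t(apply(test, 1, collapse_steps)),
                          stringsAsFactors = FALSE)
names(steps_df) <- as.character(seq_len(ncol(steps_df)))
print(steps_df)
```

`apply()` is used here instead of `lapply()` because it already iterates over rows; an `lapply()` over `seq_len(nrow(test))` followed by `do.call(rbind, ...)` would be an equivalent alternative.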
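[Comments]:
- Do you need it to be a valid data.frame padded with NA values, or would a list associated with each ID be enough?
- I want to count identical rows, so I think having it as a valid data.frame would be a good thing. Or would a list of lists make that kind of counting convenient?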
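For counting identical rows, an NA-padded data.frame is probably sufficient. A minimal sketch, assuming the collapsed data looks like the desired `result` from the question (re-typed below as `steps` so the block is self-contained; `steps`, `key`, and `counts` are hypothetical names): paste each row into a single key and tabulate the keys with `table()`.

```r
# Values re-typed from the desired `result` in the question
steps <- data.frame(
  "1" = c("freshman","freshman","freshman","sophomore","sophomore","senior"),
  "2" = c("junior","junior","junior","senior","senior",NA),
  "3" = c("senior","sophomore","sophomore",NA,NA,NA),
  "4" = c(NA,"senior","senior",NA,NA,NA),
  check.names = FALSE, stringsAsFactors = FALSE)

# One string key per row, skipping the NA padding
key <- apply(steps, 1, function(x) paste(x[!is.na(x)], collapse = " > "))

# Number of rows sharing each distinct path
counts <- table(key)
print(counts)
```

A list of per-ID vectors could be counted the same way by building the keys with `sapply()` over the list instead of `apply()` over rows.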
Tags: r dataframe data.table lapply