[Posted]: 2016-04-23 21:05:53
[Question]:
I'm confused by some tidyr behavior. I can unnest a single response column like this:
library(tidyr)
resp1 <- c("A", "B; A", "B", NA, "B")
resp2 <- c("C; D; F", NA, "C; F", "D", "E")
resp3 <- c(NA, NA, "G; H; I", "H; I", "I")
data <- data.frame(resp1, resp2, resp3, stringsAsFactors = FALSE)
tidy <- data %>%
  transform(resp1 = strsplit(resp1, "; ")) %>%
  unnest()
# Source: local data frame [6 x 3]
#
#     resp2   resp3 resp1
#     (chr)   (chr) (chr)
# 1 C; D; F      NA     A
# 2      NA      NA     B
# 3      NA      NA     A
# 4    C; F G; H; I     B
# 5       D    H; I    NA
# 6       E       I     B
But my dataset requires unnesting several columns, and those columns contain different numbers of NAs. I tried the following, which throws an error:
data %>%
  transform(resp1 = strsplit(resp1, "; "),
            resp2 = strsplit(resp2, "; "),
            resp3 = strsplit(resp3, "; ")) %>%
  unnest()
# Error: All nested columns must have the same number of elements.
I expected the code above to give the same output as the following:
# unnesting multiple response (desired output / is there a better way?)
data %>%
  transform(resp1 = strsplit(resp1, "; ")) %>%
  unnest() %>%
  transform(resp2 = strsplit(resp2, "; ")) %>%
  unnest() %>%
  transform(resp3 = strsplit(resp3, "; ")) %>%
  unnest()
#    resp1 resp2 resp3
#    (chr) (chr) (chr)
# 1      A     C    NA
# 2      A     D    NA
# 3      A     F    NA
# 4      B    NA    NA
# 5      A    NA    NA
# 6      B     C     G
# 7      B     C     H
# 8      B     C     I
# 9      B     F     G
# 10     B     F     H
# 11     B     F     I
# 12    NA     D     H
# 13    NA     D     I
# 14     B     E     I
I'm new to R, but this feels clunky and makes me wonder whether I'm misusing something. Why does unnesting several columns at once fail?
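For reference, here is a small base-R check of what the error message seems to point at. It is only a sketch, under the assumption (suggested by the message, not confirmed here) that unnesting several list columns at once pairs their elements row by row and therefore needs equal element counts within each row. The names `counts`, `same_counts`, and `rows_per_input` are made up for illustration.

```r
resp1 <- c("A", "B; A", "B", NA, "B")
resp2 <- c("C; D; F", NA, "C; F", "D", "E")
resp3 <- c(NA, NA, "G; H; I", "H; I", "I")

# Number of elements each row holds in each column after splitting.
# strsplit() turns an NA string into a single NA element, so it counts as 1.
counts <- data.frame(
  resp1 = lengths(strsplit(resp1, "; ")),
  resp2 = lengths(strsplit(resp2, "; ")),
  resp3 = lengths(strsplit(resp3, "; "))
)
print(counts)

# Rows whose three counts agree; only these could be paired element by element.
same_counts <- counts$resp1 == counts$resp2 & counts$resp2 == counts$resp3
print(same_counts)

# Unnesting one column at a time instead crosses the elements within each row,
# so each input row contributes the product of its counts.
rows_per_input <- counts$resp1 * counts$resp2 * counts$resp3
print(sum(rows_per_input))  # 14
```

The total of 14 matches the row count of the desired output above, which is consistent with the step-by-step version producing every combination within a row rather than pairing elements by position.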
[Discussion]: