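【Posted】:2020-04-15 18:28:39
【Question】:
Is there a more efficient way to do this? I want to create a column for the item type. Each participant has a different number of items, which makes this really tricky. Here is a toy example of my data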
structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L), condition = c("high", "high", "high", "high", "high",
"high", "high", "high", "medium", "medium", "medium", "medium",
"medium", "medium", "medium", "low", "low", "low", "low", "low",
"low", "low", "low", "low", "low", "low", "low", "low", "low",
"low", "high", "high", "high", "high", "high", "high", "high",
"medium", "medium", "medium", "medium", "medium", "medium", "medium"
), item = c("abcde", "bcdef", "cdefgh", "defgh", "efghi", "fghijk",
"ghijkl", "hijklm", "1234", "2345", "3456", "4567", "5678", "6789",
"7890", "onion", "celery", "tomato", "carrot", "green bean",
"lettuce", "garlic", "abcde", "bcdef", "cdefgh", "defgh", "efghi",
"fghijk", "ghijkl", "hijklm", "onion", "celery", "tomato", "carrot",
"green bean", "lettuce", "garlic", "1234", "2345", "3456", "4567",
"5678", "6789", "7890")), row.names = c(NA, -44L), class = c("tbl_df",
"tbl", "data.frame"))
This is what I have done so far, but it is a nightmare because I have more than a hundred different items:
df$subs <- 0
df$subs[df$item=="abcde"] <- "A"
df$subs[df$item=="bcdef"] <- "A"
df$subs[df$item=="cdefg"] <- "A"
df$subs[df$item=="defgh"] <- "A"
df$subs[df$item=="efghi"] <- "A"
df$subs[df$item=="12345"] <- "B"
df$subs[df$item=="23456"] <- "B"
df$subs[df$item=="34567"] <- "B"
df$subs[df$item=="45678"] <- "B"
df$subs[df$item=="56789"] <- "B"
df$subs[df$item=="onion"] <- "C"
df$subs[df$item=="celery"] <- "C"
df$subs[df$item=="tomato"] <- "C"
df$subs[df$item=="carrot"] <- "C"
df$subs[df$item=="green bean"] <- "C"
Is there a faster way to do this with the tidyverse?
【Comments】:
-
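This doesn't use the tidyverse, and it may still not be the best solution, but you could do something like
x <- c("abcde", "bcdef", "cdefg", "defgh", "efghi") followed by df$subs[df$item %in% x] <- "A". At the very least it saves you from writing one line for every value you want to match. -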
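The %in% idea from this comment can be stretched to cover every group in one loop. Below is a minimal sketch, assuming the item groups are kept in a named list; the name groups and the small df are made up for illustration and only loosely follow the question's toy data.

```r
# Toy data: a small, illustrative subset of items (not the full data from the question)
df <- data.frame(
  id   = c(1L, 1L, 1L, 2L, 2L),
  item = c("abcde", "onion", "bcdef", "celery", "zzz"),
  stringsAsFactors = FALSE
)

# Hypothetical grouping: one character vector of items per letter
groups <- list(
  A = c("abcde", "bcdef", "defgh", "efghi"),
  C = c("onion", "celery", "tomato", "carrot", "green bean")
)

# Start from NA (rather than 0) so unmatched items are easy to spot
df$subs <- NA_character_

# One %in% assignment per group instead of one line per item
for (g in names(groups)) {
  df$subs[df$item %in% groups[[g]]] <- g
}
```

Items that appear in no group (here "zzz") keep NA, which makes typos in the item names easier to catch than a default of 0 would.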
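If there is no clear rule behind the item-to-LETTER mapping, it is hard to see how code could do the assignment. If you have to assign the letters by hand, it may be easier to do that in Excel and then import the finished data frame into R. For the LETTERS column in Excel, you can use the data validation feature and create a list of allowed values. That gives every cell a drop-down menu, which makes the entry easier. Also, each time you finish assigning a LETTER you can sort the column so that all the cells that are still unassigned stay contiguous.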
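If the mapping is filled in by hand as this comment suggests, attaching it to the data comes down to a table lookup. Here is a sketch under the assumption that the hand-made table has two columns named item and subs; it is built inline here instead of being read from a file, since the real file is not known.

```r
# Illustrative data with one item that has no letter yet
df <- data.frame(
  item = c("abcde", "1234", "onion", "garlic", "abcde"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table, e.g. what a hand-edited spreadsheet would contain
lookup <- data.frame(
  item = c("abcde", "1234", "onion"),
  subs = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# match() gives, for each df$item, its row in lookup (NA when absent)
df$subs <- lookup$subs[match(df$item, lookup$item)]

# Items that still need a letter assigned
unassigned <- unique(df$item[is.na(df$subs)])
```

With dplyr loaded, left_join(df, lookup, by = "item") applied to the original df should produce the same subs column, which is probably the closest tidyverse equivalent.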