[Posted]: 2020-07-15 17:42:22
[Question]:
Here is my fake data:
#> id column
#> 1 blue, red, dog, cat
#> 2 red, blue, dog
#> 3 blue
#> 4 red
#> 5 dog, cat
#> 6 cat
#> 7 red, cat
#> 8 dog
#> 9 cat, red
#> 10 blue, cat
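For example, I want to tell R that dog and cat = animal, and that red and blue = colour. I basically want to count how many rows contain animals, colours, or both (and eventually get the percentages). The result I am after looks like this: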
#> id column newcolumn
#> 1 blue, red, dog, cat both
#> 2 red, blue, dog both
#> 3 blue colour
#> 4 red colour
#> 5 dog, cat animal
#> 6 cat animal
#> 7 red, cat both
#> 8 dog animal
#> 9 cat, red both
#> 10 blue, cat both
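So far, I have only managed to total the counts of red, blue, dog and cat individually, by doing the following:
# Collapse all rows into one comma-separated string
column.string <- paste(df$column, collapse = ",")
# Split that string back into individual items
column.vector <- strsplit(column.string, ",")[[1]]
# Remove the spaces left over after each comma
column.vector.clean <- gsub(" ", "", column.vector)
# Count how often each item occurs
table(column.vector.clean)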
Many thanks for your help. Here is the code for my fake data example:
df <- data.frame(id = c(1:10),
column = c("blue, red, dog, cat", "red, blue, dog", "blue", "red", "dog, cat", "cat", "red, cat", "dog", "cat, red", "blue, cat"))
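One possible way to get from this data to the newcolumn shown above is to split each row into its items and check which group the items belong to. The following is only a minimal base R sketch, not a definitive solution; the names animals, colours, tokens, has_animal, has_colour, counts and percentages are made up for illustration, and it assumes every item is exactly one of the four words in the example.

```r
# Fake data from the question (stringsAsFactors = FALSE so strsplit() gets characters)
df <- data.frame(id = 1:10,
                 column = c("blue, red, dog, cat", "red, blue, dog", "blue", "red",
                            "dog, cat", "cat", "red, cat", "dog", "cat, red",
                            "blue, cat"),
                 stringsAsFactors = FALSE)

# Hypothetical lookup vectors: which words count as which group
animals <- c("dog", "cat")
colours <- c("red", "blue")

# Split each row on commas and trim the surrounding whitespace
tokens <- lapply(strsplit(df$column, ","), trimws)

# For each row, does it contain at least one animal / at least one colour?
has_animal <- vapply(tokens, function(x) any(x %in% animals), logical(1))
has_colour <- vapply(tokens, function(x) any(x %in% colours), logical(1))

# Label each row; rows matching neither group would become NA
df$newcolumn <- ifelse(has_animal & has_colour, "both",
                ifelse(has_animal, "animal",
                ifelse(has_colour, "colour", NA_character_)))

# Counts per label, and the same as percentages of all rows
counts <- table(df$newcolumn)
percentages <- round(100 * prop.table(counts), 1)

print(df)
print(counts)
print(percentages)
```

Keeping the group definitions in plain vectors means a new word (say, "green") only needs to be added in one place; whether this scales to the real data depends on how clean the actual strings are.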
[Comments]:
Tags: r