【Posted】: 2018-05-21 06:05:13
【Question】:
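According to this question and answer, a long list can be converted into a binary data frame.
But how can that approach be used on a data frame in which the same value appears more than once per user?
Example data frame: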
d_long <- data.frame( nameid = c("sally","sally","sally", "sally","Robert","annie","annie","annie"), value = c("product1","ra","ent","ra","ra","ra","product1","product1"))
  nameid    value
1  sally product1
2  sally       ra
3  sally      ent
4  sally       ra
5 Robert       ra
6  annie       ra
7  annie product1
8  annie product1
The expected output looks like this:
d_exist <- data.frame(nameid = c("sally","Robert","annie"), product1 = c(1,0,1), ra = c(1,1,1), ent = c(1,0,0))
  nameid product1 ra ent
1  sally        1  1   1
2 Robert        0  1   0
3  annie        1  1   0
But when I try this:
library(dplyr)
library(tidyr)
d_long %>%
  group_by(nameid, value) %>%
  mutate(count = n()) %>%
  ungroup() %>%
  spread(value, count, fill = 0) %>%
  as.data.frame()
I get the error:
Error: Duplicate identifiers for rows (7, 8), (2, 4)
Would it be appropriate to simply use
d_long[!duplicated(d_long), ]
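For illustration, here is a minimal sketch of how the deduplication idea could be combined with the pipeline above. It assumes dplyr and tidyr are installed. It swaps in `distinct()` for `duplicated()`, and the `present` column name is made up for this example. The reasoning is that `spread()` needs each (nameid, value) pair to appear only once, so dropping repeated pairs first should avoid the duplicate-identifier error.

```r
library(dplyr)
library(tidyr)

d_long <- data.frame(
  nameid = c("sally", "sally", "sally", "sally", "Robert", "annie", "annie", "annie"),
  value  = c("product1", "ra", "ent", "ra", "ra", "ra", "product1", "product1"),
  stringsAsFactors = FALSE
)

d_exist <- d_long %>%
  distinct(nameid, value) %>%          # keep one row per (nameid, value) pair
  mutate(present = 1) %>%              # hypothetical indicator column to spread
  spread(value, present, fill = 0) %>% # pairs that never occur are filled with 0
  as.data.frame()

d_exist
```

The row and column order of the result may differ from the expected output shown earlier, since `spread()` sorts by the key values.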
【Comments】:
-
table(d_long) will get you closer.
-
You need a sequence by group.
-
Something like this might help, but I'm not sure:
(table(d_long$nameid, d_long$value) > 0) + 0
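As a rough sketch of where the `table()` suggestion leads, the snippet below uses base R only to turn the 0/1 cross-tabulation into a data frame with a `nameid` column. The intermediate names `counts`, `flags`, and `wide` are chosen here for illustration and do not come from the comments.

```r
d_long <- data.frame(
  nameid = c("sally", "sally", "sally", "sally", "Robert", "annie", "annie", "annie"),
  value  = c("product1", "ra", "ent", "ra", "ra", "ra", "product1", "product1")
)

# Count how often each value occurs for each nameid.
counts <- table(d_long$nameid, d_long$value)

# Any count above zero becomes 1, everything else stays 0.
flags <- (counts > 0) + 0

# Convert the two-way table to a wide data frame (one column per value).
wide <- as.data.frame.matrix(flags)

# Move the row names into a regular nameid column.
d_exist <- cbind(nameid = rownames(wide), wide)
rownames(d_exist) <- NULL

d_exist
```

Because the counts are collapsed to presence flags, repeated (nameid, value) rows in the input need no special handling with this approach.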