【Posted】: 2018-07-01 09:18:46
【Question】:
I have the following data frame:
   location  asset_status count   row
   <chr>     <chr>        <dbl> <int>
 1 location1 Owned            1     1
 2 location1 Available        1     2
 3 location1 Owned            1     3
 4 location2 Owned            1     4
 5 location2 Owned            1     5
 6 location2 Owned            1     6
 7 location2 Owned            1     7
 8 location2 no status        1     8
 9 location3 Owned            1     9
10 location3 Owned            1    10
When I try to use spread on it, I get the following error:
df <- head(us_can_laptops, 10) %>%
  select(location, asset_status, count) %>%
  # mutate(row = row_number()) %>%  # excluded
  group_by(location) %>%
  spread(asset_status, count)
Error: Duplicate identifiers for rows (4, 5, 6, 7), (1, 3)
So, following other related questions on SO, I added a unique identifier with mutate:
df <- head(us_can_laptops, 10) %>%
  select(location, asset_status, count) %>%
  mutate(row = row_number()) %>%
  group_by(location) %>%
  spread(asset_status, count)
But this returns:
   location    row Available `no status` Owned
 * <chr>     <int>     <dbl>       <dbl> <dbl>
 1 location2     4        NA          NA     1
 2 location2     5        NA          NA     1
 3 location2     6        NA          NA     1
 4 location2     7        NA          NA     1
 5 location2     8        NA           1    NA
 6 location3    10        NA          NA     1
 7 location3     9        NA          NA     1
 8 location1     1        NA          NA     1
 9 location1     2         1          NA    NA
10 location1     3        NA          NA     1
Also, whenever I try adding a summarise call, it breaks my spread.
This is the desired result:
  location  Available `no status` Owned
* <chr>         <dbl>       <dbl> <dbl>
1 location1         1          NA     2
2 location2        NA           1     4
3 location3        NA          NA     2
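Any help would be greatly appreciated. I know this looks like a duplicate, but the answers to the following linked questions still don't solve the problem for me: "Spread function Error: Duplicate identifiers for rows [duplicate]" and "Spread with duplicate identifiers for rows".
I'm really looking for a solution using dplyr, not dcast.
【Comments】: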
-
In the expected output, where does the value 4 in the first row come from?
-
Oh, please ignore the row column. That's just a counter. I'll remove it in the actual code.
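One possible way to get from the sample rows to the desired table is sketched below. It is not a confirmed fix from this thread. The idea is that spread needs each combination of location and asset_status to appear only once, so the counts are summed per combination before reshaping; grouping by location alone does not remove the duplicates. The `laptops` tibble is a hypothetical stand-in for head(us_can_laptops, 10), rebuilt from the rows printed in the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for head(us_can_laptops, 10), rebuilt from the
# rows printed in the question (the row counter column is left out).
laptops <- tibble::tribble(
  ~location,   ~asset_status, ~count,
  "location1", "Owned",       1,
  "location1", "Available",   1,
  "location1", "Owned",       1,
  "location2", "Owned",       1,
  "location2", "Owned",       1,
  "location2", "Owned",       1,
  "location2", "Owned",       1,
  "location2", "no status",   1,
  "location3", "Owned",       1,
  "location3", "Owned",       1
)

wide <- laptops %>%
  # Collapse duplicates: one row per location / asset_status pair.
  group_by(location, asset_status) %>%
  summarise(count = sum(count)) %>%
  # Drop the remaining grouping before reshaping.
  ungroup() %>%
  # Each pair is now unique, so spread has no duplicate identifiers.
  spread(asset_status, count)

print(wide)
```

Combinations that never occur in the data (for example Available at location3) come out as NA. In tidyr 1.0.0 and later, pivot_wider(names_from = asset_status, values_from = count, values_fn = sum) should perform the aggregation and the reshape in a single step.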