[Posted]: 2020-01-15 22:12:04
[Question]:
I have a raw dataset, and I am struggling to produce the output I want.
The raw dataset looks like this:
   gender type neg_sentiment neu_sentiment pos_sentiment
1       M  rep          7871          3454          7290
2       F  rep           841           469           548
3       M  rep            23            12            26
4       M  rep           211            73            63
5       M  rep          2587           868          1251
6       M  rep          1273           606           594
7       M  rep           374           150           260
8       M  rep            30            23           138
9       M  rep            95            30            23
10      M  rep            22            22           121
From this, my desired output (with example values for the sums) looks like this:
gender neg_sentiment neu_sentiment pos_sentiment
     M         10000          5000          3000
     F          2000           500          7000
What I tried:
df %>% group_by(gender) %>% summarise_all(sum)
df %>% group_by(type) %>% summarise_all(sum)
But it does not work.
Could you help me produce the desired output?
The dput() output of the data is as follows:
structure(list(gender = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("F", "M"), class = "factor"), type = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("rep", "sen"), class = "factor"), neg_sentiment = c(7871L, 841L, 23L, 211L, 2587L, 1273L, 374L, 30L, 95L, 22L), neu_sentiment = c(3454L, 469L, 12L, 73L, 868L, 606L, 150L, 23L, 30L, 22L), pos_sentiment = c(7290L,
548L, 26L, 63L, 1251L, 594L, 260L, 138L, 23L, 121L)), row.names = c(NA, 10L), class = "data.frame")
[Comments]:
- You need:
df1 %>% group_by(gender) %>% summarise_if(is.numeric, sum)
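The suggestion above can be sketched end to end as follows. This is a minimal sketch, assuming the comment's df1 is the same data frame the question calls df; the names by_gender and by_gender_across are made up for illustration. The likely reason the original attempt fails is that summarise_all(sum) also tries to sum the remaining factor column (type, or gender when grouping by type), and sum() is not defined for factors. The second pipeline uses across(), which is an alternative spelling that assumes dplyr 1.0.0 or later and is not something the comment itself proposes.

```r
library(dplyr)

# Data rebuilt from the dput() output in the question
df <- data.frame(
  gender = factor(c("M", "F", "M", "M", "M", "M", "M", "M", "M", "M")),
  type = factor(rep("rep", 10), levels = c("rep", "sen")),
  neg_sentiment = c(7871L, 841L, 23L, 211L, 2587L, 1273L, 374L, 30L, 95L, 22L),
  neu_sentiment = c(3454L, 469L, 12L, 73L, 868L, 606L, 150L, 23L, 30L, 22L),
  pos_sentiment = c(7290L, 548L, 26L, 63L, 1251L, 594L, 260L, 138L, 23L, 121L)
)

# Sum only the numeric columns within each gender;
# the factor column `type` is skipped instead of being passed to sum()
by_gender <- df %>%
  group_by(gender) %>%
  summarise_if(is.numeric, sum)

# Alternative spelling with across() (assumes dplyr >= 1.0.0)
by_gender_across <- df %>%
  group_by(gender) %>%
  summarise(across(where(is.numeric), sum))

print(by_gender)
```

For the ten rows shown in the question, both pipelines should return one row per gender, with F first because it is the first factor level. Grouping by type instead only requires swapping the column name in group_by().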