In R, how can I use group by for multiple columns? Answers

【Question title】: In R, how can I use group by for multiple columns?
【Posted】: 2020-01-15 22:12:04
【Question description】:

I have a raw dataset and am struggling to produce the output I want.

The raw dataset looks like this:

    gender type neg_sentiment neu_sentiment  pos_sentiment
1       M  rep          7871          3454          7290
2       F  rep           841           469           548
3       M  rep            23            12            26
4       M  rep           211            73            63
5       M  rep          2587           868          1251
6       M  rep          1273           606           594
7       M  rep           374           150           260
8       M  rep            30            23           138
9       M  rep            95            30            23
10      M  rep            22            22           121

From this, the output I want (with made-up example values for the sums) looks like this:

gender neg_sentiment    neu_sentiment     pos_sentiment
  M      10000             5000              3000
  F      2000               500              7000

What I tried is:

df %>% group_by(gender) %>% summarise_all(sum)
df %>% group_by(type) %>% summarise_all(sum)

But it does not work.

Could you help me produce the desired output?

The dput() output of the data is as follows:

structure(list(gender = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("F", "M"), class = "factor"), type = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("rep", "sen"), class = "factor"), neg_sentiment = c(7871L, 841L, 23L, 211L, 2587L, 1273L, 374L, 30L, 95L, 22L), neu_sentiment = c(3454L, 469L, 12L, 73L, 868L, 606L, 150L, 23L, 30L, 22L), pos_sentiment = c(7290L, 
548L, 26L, 63L, 1251L, 594L, 260L, 138L, 23L, 121L)), row.names = c(NA, 10L), class = "data.frame")

【Comments】:

You need df1 %>% group_by(gender) %>% summarise_if(is.numeric, sum)

Tags: r group-by

【Solution 1】:

We can use summarise_if to select only the numeric columns:

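library(dplyr)

# Sum every numeric column within each gender group
df1 %>%
  group_by(gender) %>%
  summarise_if(is.numeric, sum)

# Alternatively, pick the columns by name with summarise_at:
# df1 %>%
#   group_by(gender) %>%
#   summarise_at(vars(ends_with('sentiment')), sum)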
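
The attempt in the question most likely fails because summarise_all(sum) also tries to sum the non-grouping factor column (type when grouping by gender, gender when grouping by type), and sum() is not defined for factors. As a minimal sketch, assuming dplyr 1.0.0 or later is installed, the same idea can be written with across(), and both columns can be passed to group_by() at once, which is what the title asks about. The data frame below is rebuilt from the dput output in the question; the name result is only illustrative.

```r
library(dplyr)

# Rebuild the question's data (values copied from the dput output above)
df1 <- data.frame(
  gender = factor(c("M", "F", "M", "M", "M", "M", "M", "M", "M", "M")),
  type = factor(rep("rep", 10), levels = c("rep", "sen")),
  neg_sentiment = c(7871L, 841L, 23L, 211L, 2587L, 1273L, 374L, 30L, 95L, 22L),
  neu_sentiment = c(3454L, 469L, 12L, 73L, 868L, 606L, 150L, 23L, 30L, 22L),
  pos_sentiment = c(7290L, 548L, 26L, 63L, 1251L, 594L, 260L, 138L, 23L, 121L)
)

# Group by both columns, then sum only the numeric columns.
# .groups = "drop" returns an ungrouped tibble.
result <- df1 %>%
  group_by(gender, type) %>%
  summarise(across(where(is.numeric), sum), .groups = "drop")

print(result)
```

With the sample data, every row has type "rep", so this gives one row per gender: F with 841, 469 and 548, and M with 12486, 5238 and 9766 for the negative, neutral and positive columns. If only the per-gender totals are wanted, dropping type from group_by() and keeping where(is.numeric) should give the same sums without the type column.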

【Discussion】: