How to aggregate and merge one column by a second column in R

【Question title】: How to aggregate and merge one column by a second column in r
【Posted】: 2020-11-03 17:56:45
【Question】:

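In my current research I run into this particular problem surprisingly often. Suppose I have a data frame containing total consumption for every US state. I want to use county population (which I have) to estimate county consumption (which I do not have). Population data usually comes in long format, with columns for county, state, and population. If the consumption data frame is called cons and the population data frame is called pop, my usual approach looks like this: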

#data
pop <- as.data.frame(rnorm(12)+4)
pop$county <- letters[10:21]
pop$state <- c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C","C","C")
colnames(pop)[1] <- "pop"
cons <- as.data.frame(c(10^5, 4*10^4, 8*10^4))
colnames(cons) <- "cons"
cons$state <- c("A", "B", "C")


agg_pop <- aggregate(list(pop_state = pop$pop), by = list(state = pop$state), FUN = sum, na.rm = TRUE) # aggregating population by state
pop <- merge(pop, agg_pop, by = "state") # Merging the state population with the county population data
pop$share <- pop$pop/pop$pop_state # Calculating each county's share of state population
pop <- merge(pop, cons, by = "state") # Merging consumption data onto population data
pop$estimated_cons <- pop$cons * pop$share # multiplying county's share of state population with state consumption

Can anyone think of a simpler way to do this, using only one or two functions?

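【Question comments】:

Hello! Could you provide a minimal reproducible example?
Could you share a reproducible data sample?
@grouah I have tried to add an example with simulated data.
Hi @pkpkPPkafa, was my answer useful? If so, please don't hesitate to accept it.

Tags: r merge aggregate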

【Solution 1】:

Are you looking for something like this?

Using dplyr:

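library(dplyr)  # library() stops with an error if dplyr is missing; require() only warns

# Run on the original pop, before the merge steps from the question
pop %>%
  left_join(cons) %>%   # add state consumption; joins on the shared "state" column
  group_by(state) %>%   # so that sum(pop) below is the state total
  mutate(share_cons = cons * (pop / sum(pop)))  # state consumption * county's population share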

Joining, by = "state"
# A tibble: 12 x 5
# Groups:   state [3]
     pop county state   cons share_cons
   <dbl> <chr>  <chr>  <dbl>      <dbl>
 1  3.63 j      A     100000     23226.
 2  4.09 k      A     100000     26157.
 3  3.71 l      A     100000     23763.
 4  4.20 m      A     100000     26854.
 5  5.32 n      B      40000     14913.
 6  3.59 o      B      40000     10062.
 7  5.36 p      B      40000     15026.
 8  4.06 q      C      80000     16029.
 9  1.77 r      C      80000      6985.
10  4.45 s      C      80000     17568.
11  5.38 t      C      80000     21228.
12  4.61 u      C      80000     18190.
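
For comparison, the same calculation can probably be written in base R with one merge() and one ave() call. This is only a sketch under assumptions: the sample data below is a seeded stand-in with the same shape as the question's pop and cons (so its numbers will not match the output above), and est is a hypothetical name for the result.

```r
# Hypothetical seeded sample data, shaped like the question's pop and cons
set.seed(42)
pop <- data.frame(
  state  = rep(c("A", "B", "C"), times = c(4, 3, 5)),
  county = letters[10:21],
  pop    = rnorm(12) + 4
)
cons <- data.frame(state = c("A", "B", "C"), cons = c(1e5, 4e4, 8e4))

# Attach each state's consumption to its counties
est <- merge(pop, cons, by = "state")

# ave() returns the per-state population total, aligned with each county row
est$estimated_cons <- est$cons * est$pop / ave(est$pop, est$state, FUN = sum)

# Sanity check: county estimates should add back up to the state totals
aggregate(estimated_cons ~ state, data = est, FUN = sum)
```

Because ave() keeps the result the same length as its input, the separate aggregate-then-merge step for the state population is not needed.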

【Comments】: