【Title】: dplyr::summarise: pull a value based on the max of another column
【Posted】: 2020-12-17 23:14:45
【Question】:

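Based on the reproducible code below, how can I conditionally add an Address column based on max(LeastNEmployees), i.e. keep the Address from the row where LeastNEmployees is largest within each group?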

library(dplyr)  # %>%, left_join, group_by, summarize, across
library(readr)  # read_csv

dat_url <- "https://gender-pay-gap.service.gov.uk/viewing/download-data/2019"
dat <- read_csv(dat_url)

# Lookup table: map each EmployerSize band to its minimum number of employees
df = data.frame(EmployerSize=c('Less than 250','250 to 499', '500 to 999', '1000 to 4999', '5000 to 19,999', '20,000 or more'),
               LeastNEmployees = c(1,250,500, 1000, 5000, 20000))

a <- dat %>% 
   left_join(df, c('EmployerSize' = 'EmployerSize')) %>% 
   group_by(ResponsiblePerson) %>% 
   summarize(
     across(where(is.numeric) & !starts_with("Least"), mean),
     across(c("EmployerName","SicCodes"), ~toString(.x)),
     LeastNEmployees = max(LeastNEmployees))
     

【Comments】:

    Tags: r dplyr


    【Solution 1】:

    Here is an approach that uses a which condition.

    a <- dat %>% 
      left_join(df, by = c('EmployerSize' = 'EmployerSize')) %>% 
      group_by(ResponsiblePerson) %>% 
      summarize(
        across(where(is.numeric) & !starts_with("Least"), mean),
        across(c("EmployerName","SicCodes"), ~toString(.x)),
        # Pick Address before LeastNEmployees is overwritten by its group max;
        # which.max() gives the first row holding the maximum, [1] yields NA if there is none
        Address = Address[which.max(LeastNEmployees)][1],
        LeastNEmployees = max(LeastNEmployees))
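    The order of the expressions inside summarize() matters here: once LeastNEmployees has been reassigned to its group maximum, later expressions see that single value instead of the original column, so an index computed from it always points at the first row. The following is a minimal sketch on made-up data (the toy tibble and its addresses are hypothetical, not taken from the gender pay gap file) that selects Address first and only then collapses LeastNEmployees. It assumes dplyr 1.0.0 or later for the .groups argument.

```r
library(dplyr)

# Hypothetical toy data standing in for the joined table
toy <- tibble(
  ResponsiblePerson = c("A", "A", "B", "B", "B"),
  Address           = c("1 High St", "2 Low Rd", "3 Mill Ln", "4 Park Ave", "5 Quay St"),
  LeastNEmployees   = c(250, 5000, 1000, 500, NA)
)

res <- toy %>%
  group_by(ResponsiblePerson) %>%
  summarise(
    # Index Address while LeastNEmployees is still the per-row column;
    # which.max() skips NA and returns the first position of the maximum
    Address = Address[which.max(LeastNEmployees)][1],
    # Only now collapse LeastNEmployees to one value per group
    LeastNEmployees = max(LeastNEmployees, na.rm = TRUE),
    .groups = "drop"
  )

print(res)
```

    On this toy input, group A keeps "2 Low Rd" (5000 employees) and group B keeps "3 Mill Ln" (1000 employees). If several rows tie for the maximum, only the first one is kept; returning every tied address would need a different choice, such as collapsing them with toString().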
    

    【Discussion】:
