【Question title】: left_join in a for loop with different column names
【Posted】: 2020-03-24 05:04:37
【Question】:

I have a data.frame named a with a structure like this:

a <- data.frame(X1=c("A", "B", "C", "A", "C", "D"),
                X2=c("B", "C", "D", "A", "B", "A"),
                X3=c("C", "D", "A", "B", "A", "B")
               )

I also have a second data frame, b:

b <- data.frame(Xn=c("A", "B", "C", "D"),
                Feature=c("some", "more", "what", "why"))

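I want to add the Feature values from b to a, so that each of X1, X2, and X3 in a gets its own corresponding feature column. In other words, the columns of a should become: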

colnames(a) <- c("X1", "X2", "X3", "Features1", "Features2", "Features3")

How can I do this using left_join in a for loop?

【Comments】:

    Tags: r for-loop left-join cran


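    【Solution 1】:

    In base R, we can unlist the data frame a and match its values against b$Xn to get the corresponding Feature values. Assigning into temp[] keeps the original shape of the data frame, and we can then cbind the result to the original data frame to get the final answer.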

    temp <- a
    temp[] <- b$Feature[match(unlist(temp), b$Xn)]
    names(temp) <- paste0('Feature', seq_along(temp))
    cbind(a, temp)
    
    #  X1 X2 X3 Feature1 Feature2 Feature3
    #1  A  B  C     some     more     what
    #2  B  C  D     more     what      why
    #3  C  D  A     what      why     some
    #4  A  A  B     some     some     more
    #5  C  B  A     what     more     some
    #6  D  A  B      why     some     more
    

    In the tidyverse, we can reshape the data to long format, join it with b, and then reshape it back to wide format.

    library(dplyr)
    library(tidyr)
    
    a %>%
      mutate(row = row_number()) %>%
      pivot_longer(cols = -row) %>%
      left_join(b, by = c('value' = 'Xn'))  %>%
      select(-value) %>%
      pivot_wider(names_from = name, values_from = Feature) %>%
      select(-row) %>%
      rename_all(~paste0('Feature', seq_along(.))) %>%
      bind_cols(a, .)
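
Neither approach above uses the for loop that the question asks about. As a minimal sketch of that idea (not part of the original answer), one option is to rename the columns of b on each iteration so that the join key matches the current column of a. The names result, key_cols, and lookup are hypothetical helpers introduced only for this illustration, and the sketch assumes the keys in b$Xn are unique, so the join does not duplicate rows.

```r
library(dplyr)

a <- data.frame(X1 = c("A", "B", "C", "A", "C", "D"),
                X2 = c("B", "C", "D", "A", "B", "A"),
                X3 = c("C", "D", "A", "B", "A", "B"),
                stringsAsFactors = FALSE)
b <- data.frame(Xn = c("A", "B", "C", "D"),
                Feature = c("some", "more", "what", "why"),
                stringsAsFactors = FALSE)

result <- a
key_cols <- names(a)

for (i in seq_along(key_cols)) {
  # Rename b's two columns: the key takes the name of the current column
  # of a, and the feature column gets a numbered name (Feature1, Feature2, ...).
  lookup <- setNames(b, c(key_cols[i], paste0("Feature", i)))
  # left_join keeps every row of result in its original order.
  result <- left_join(result, lookup, by = key_cols[i])
}

result
```

Renaming b inside the loop avoids the name clash that would otherwise occur when the same Feature column is joined three times.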
    

    【Comments】:

      【Solution 2】:

      This can be done by using mutate_all to recode all the columns in a:

      library(tidyverse)
      
      a %>% 
        mutate_all(funs(feat=recode(., !!!set_names(as.character(b$Feature), b$Xn))))
      
        X1    X2    X3    X1_feat X2_feat X3_feat
      1 A     B     C     some    more    what   
      2 B     C     D     more    what    why    
      3 C     D     A     what    why     some   
      4 A     A     B     some    some    more   
      5 C     B     A     what    more    some   
      6 D     A     B     why     some    more
      

      You can add rename_at to get the desired names:

      a %>% 
        mutate_all(funs(f=recode(., !!!set_names(as.character(b$Feature), b$Xn)))) %>% 
        rename_at(vars(matches("f")), ~gsub(".([0-9]).*", "Feature\\1", .))
      
        X1 X2 X3 Feature1 Feature2 Feature3
      1  A  B  C     some     more     what
      2  B  C  D     more     what      why
      3  C  D  A     what      why     some
      4  A  A  B     some     some     more
      5  C  B  A     what     more     some
      6  D  A  B      why     some     more
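
Note that funs() has been deprecated since dplyr 0.8.0, and mutate_all and rename_at are superseded by across() and rename_with() in dplyr 1.0.0 and later. A rough equivalent in the newer style might look like the sketch below. It is not from the original answer: it swaps recode for a plain named lookup vector, and lookup and result are hypothetical names used only here.

```r
library(dplyr)

a <- data.frame(X1 = c("A", "B", "C", "A", "C", "D"),
                X2 = c("B", "C", "D", "A", "B", "A"),
                X3 = c("C", "D", "A", "B", "A", "B"),
                stringsAsFactors = FALSE)
b <- data.frame(Xn = c("A", "B", "C", "D"),
                Feature = c("some", "more", "what", "why"),
                stringsAsFactors = FALSE)

# Named lookup vector: names are the keys (Xn), values are the features.
lookup <- setNames(as.character(b$Feature), b$Xn)

result <- a %>%
  # Add one "<column>_feat" column per existing column by indexing the lookup.
  # as.character() guards against factor columns being indexed by integer code.
  mutate(across(everything(),
                ~ unname(lookup[as.character(.x)]),
                .names = "{.col}_feat")) %>%
  # Rename the new columns to Feature1, Feature2, ...
  rename_with(~ paste0("Feature", seq_along(.x)), ends_with("_feat"))

result
```

Values in a that have no match in b$Xn would come back as NA, which mirrors what a left join does.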
      

      【Comments】:
