【Question title】: R: create duplicate columns (bulk)
【Posted】: 2021-12-05 08:18:04
【Question】:

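How can I create multiple duplicate columns based on a column-name pattern? My actual data frame is very large, so I need a loop or some other efficient code.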

df <- data.frame(x001 = c(1,2,3), x002 = c(3,4,5), x003 = c(6,7,8), soccer = c(700,600,300), volley = c(30,22,29))

df
#current df
#   x001 x002 x003 soccer volley
#   1    3    6    700     30
#   2    4    7    600     22
#   3    5    8    300     29

#desired output: every column whose name starts with "x00" is duplicated, and the copy gets "no2" appended to its name.
#   x001 x002 x003 soccer volley x001no2 x002no2 x003no2
#   1    3    6    700    30     1       3       6
#   2    4    7    600    22     2       4       7
#   3    5    8    300    29     3       5       8

【Comments on the question】:

    Tags: r loops dplyr duplicates tidyverse


    【Solution 1】:

    A simple base R approach:

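    # Positions of the columns whose names start with "x00"
    tmp <- grep("^x00", colnames(df))
    
    # df[tmp] stays a data frame even when only one column matches
    cbind(
      df,
      setNames(df[tmp], paste0(colnames(df)[tmp], "no2"))
    )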
    
      x001 x002 x003 soccer volley x001no2 x002no2 x003no2
    1    1    3    6    700     30       1       3       6
    2    2    4    7    600     22       2       4       7
    3    3    5    8    300     29       3       5       8
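
The question asks for a loop, but the indexing above is already vectorised over all matching columns, so no explicit loop is needed. As a minimal sketch, the same idea can be wrapped in a reusable function. The name `duplicate_cols` and its `pattern` and `suffix` arguments are hypothetical and do not come from the original answer.

```r
# Hypothetical helper (not part of the original answer): duplicate every column
# whose name matches `pattern` and append `suffix` to the names of the copies.
duplicate_cols <- function(data, pattern = "^x00", suffix = "no2") {
  idx <- grep(pattern, colnames(data))
  if (length(idx) == 0) return(data)  # nothing matches: return the input unchanged
  copies <- data[idx]                 # list-style indexing always returns a data frame
  names(copies) <- paste0(names(copies), suffix)
  cbind(data, copies)
}

df <- data.frame(x001 = c(1, 2, 3), x002 = c(3, 4, 5), x003 = c(6, 7, 8),
                 soccer = c(700, 600, 300), volley = c(30, 22, 29))
res <- duplicate_cols(df)
```

Because `data[idx]` is used instead of `data[, idx]`, the helper should also behave when only one column matches the pattern.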
    

    【Comments】:

      【Solution 2】:

      Using dplyr and stringr:

      library(dplyr)
      library(stringr)
      
      df %>% bind_cols(df %>% select(starts_with('x00')) %>% rename_with(~ str_c(.x, 'no2')))
        x001 x002 x003 soccer volley x001no2 x002no2 x003no2
      1    1    3    6    700     30       1       3       6
      2    2    4    7    600     22       2       4       7
      3    3    5    8    300     29       3       5       8
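
A possible alternative, sketched here under the assumption that dplyr 1.0.0 or later is installed, is to let `mutate()` with `across()` create the copies directly. The `.names` glue specification `"{.col}no2"` builds the new column names, so neither `bind_cols()` nor stringr is needed.

```r
library(dplyr)

df <- data.frame(x001 = c(1, 2, 3), x002 = c(3, 4, 5), x003 = c(6, 7, 8),
                 soccer = c(700, 600, 300), volley = c(30, 22, 29))

# Apply the identity function to each "x00" column and store the result
# under the original name plus "no2"; the new columns are appended at the end.
res <- df %>%
  mutate(across(starts_with("x00"), ~ .x, .names = "{.col}no2"))
```

Swapping `~ .x` for a real transformation would create modified copies in the same step.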
      

      【Comments】:

        【Solution 3】:

        data.table solution:

        # Load the data.table package:
        library(data.table)
        
        # Names of the columns to replicate: df_names => character vector
        df_names <- grep(
          "^x\\d+",
          colnames(df),
          value = TRUE
        )
        
        # Add the copies by reference: res => data.table
        # (setDT() converts df itself to a data.table, so df is modified too)
        res <- setDT(df)[
          , paste0(df_names, "no2") := .SD,
          .SDcols = df_names
        ]
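
Note that `setDT()` converts `df` in place, so the original object also gains the new columns. If the original data frame should stay untouched, one option is to work on a copy made with `as.data.table()`. This is a minimal sketch; the names `dt`, `old_names`, and `new_names` are chosen here for illustration only.

```r
library(data.table)

df <- data.frame(x001 = c(1, 2, 3), x002 = c(3, 4, 5), x003 = c(6, 7, 8),
                 soccer = c(700, 600, 300), volley = c(30, 22, 29))

# Columns to replicate and the names for their copies
old_names <- grep("^x\\d+", colnames(df), value = TRUE)
new_names <- paste0(old_names, "no2")

# as.data.table() returns a new data.table, leaving df as a plain data.frame
dt <- as.data.table(df)

# Assign the copies by reference on dt only
dt[, (new_names) := .SD, .SDcols = old_names]
```

For a very large data frame the extra copy costs memory, so the by-reference `setDT()` route may still be preferable when the original object is no longer needed.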
        

        Tidyverse solution:

        # Load the tidyverse packages:
        library(tidyverse)
        
        # Replicate the desired columns: res => data.frame
        res <- df %>% 
          bind_cols(
            df %>% 
              # Keep only the columns whose names match the pattern
              select(matches("^x\\d+")) %>% 
              # Append "no2" to the name of each copy
              rename_with(~ str_c(.x, "no2"))
          )
        

        【Comments】:
