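【Question title】: Subset a dataframe with non-existing column names
【Posted】: 2017-11-01 09:49:42
【Question description】:

I have this line of code in R:

newDF <- oldDF[subsettingColumns]

subsettingColumns contains some column names that may or may not exist in oldDF. If a column does not exist, I want a column of NA inserted at the same position in newDF. How can I make this work in R?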
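For context, here is a minimal sketch of what the plain subset does when a requested name is missing. The data and column names are made up for illustration; base R stops with an error instead of filling in NA, so the error is captured here to keep the script running.

```r
# Illustrative data (an assumption, not from the question); 'd' is absent from oldDF
oldDF <- data.frame(a = 1:3, b = 4:6)
subsettingColumns <- c("a", "d")

# Capture the error message instead of stopping the script
result <- tryCatch(
  oldDF[subsettingColumns],
  error = function(e) conditionMessage(e)
)
print(result)
```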

【Question comments】:

  • I don't think this is a duplicate of the linked question.

Tags: r dataframe


【Solution 1】:

Take an example:

df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 5, 6, 7))
df

#  a b
#1 1 4
#2 2 5
#3 3 6
#4 4 7

#Columns to take subset of
subsettingColumns <- c('a', 'd', 'e')

#Columns which are already present
cols <- subsettingColumns[subsettingColumns %in% names(df)]

#Add them in the new dataframe
newdf <- df[cols]

#Assign NA to the columns which are not defined in the original dataframe
newdf[setdiff(subsettingColumns, cols)] <- NA

newdf
#  a  d  e
#1 1 NA NA
#2 2 NA NA
#3 3 NA NA
#4 4 NA NA
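One caveat with the steps above: the new columns are appended after the existing ones, so the result only matches the requested order when the missing names come last. A minimal sketch of a final reordering step, using a made-up request vector c('d', 'a', 'e'):

```r
df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 5, 6, 7))
subsettingColumns <- c('d', 'a', 'e')  # illustrative request, missing name first

cols <- subsettingColumns[subsettingColumns %in% names(df)]
newdf <- df[cols]
newdf[setdiff(subsettingColumns, cols)] <- NA

# Columns are currently a, d, e; index by the request to restore its order
newdf <- newdf[subsettingColumns]
names(newdf)
```

Indexing by subsettingColumns is safe at this point because every requested name now exists in newdf.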

【Discussion】:

    【Solution 2】:

    You can use a function that adds any columns that do not exist in the data frame, as follows:

    AddColumn <- function(oldDF, subsettingColumns) {

      # Requested names that are not columns of oldDF
      addCol <- subsettingColumns[!subsettingColumns %in% names(oldDF)]

      # Append the missing columns, filled with NA
      if (length(addCol) != 0) oldDF[addCol] <- NA
      oldDF
    }
    

    Test this function on example data:

    # Example data
    oldDF <- data.frame(A = c(1, 2, 3, 4, 5), B = c(11, 12, 13, 14, 15))
    
    AddColumn(oldDF, "testColumn")
    
    #   A   B  testColumn
    #1  1  11         NA
    #2  2  12         NA
    #3  3  13         NA
    #4  4  14         NA
    #5  5  15         NA
    
    AddColumn(oldDF, c("testColumn1", "testColumn2"))
    
    #  A   B  testColumn1  testColumn2
    #1 1  11           NA          NA
    #2 2  12           NA          NA
    #3 3  13           NA          NA
    #4 4  14           NA          NA
    #5 5  15           NA          NA
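Note that AddColumn returns every column of oldDF plus the new ones; it does not drop anything. To get the subset the question asks for, the result can be indexed by the requested names afterwards. A sketch, where the wrapper name SubsetWithNA is hypothetical and not part of the answer:

```r
# Same helper as above: append requested columns that are missing, filled with NA
AddColumn <- function(oldDF, subsettingColumns) {
  addCol <- subsettingColumns[!subsettingColumns %in% names(oldDF)]
  if (length(addCol) != 0) oldDF[addCol] <- NA
  oldDF
}

# Hypothetical wrapper: add the missing columns, then keep only the requested ones
SubsetWithNA <- function(oldDF, subsettingColumns) {
  AddColumn(oldDF, subsettingColumns)[subsettingColumns]
}

oldDF <- data.frame(A = c(1, 2, 3, 4, 5), B = c(11, 12, 13, 14, 15))
newDF <- SubsetWithNA(oldDF, c("testColumn", "B"))
newDF
```

Because the final index uses the request vector itself, the columns come back in the requested order, with column A dropped.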
    

    【Discussion】:

      【Solution 3】:

      Data

      oldDF <- mtcars
      subsettingColumns <- c("am","IDontExist","gear","IAlsoDontExist")
      

      Get the unknown columns

      unknownCol <- setdiff(subsettingColumns, names(oldDF))
      
      tempDF <- lapply(unknownCol,function(x){df=data.frame(A=NA);names(df)=x;df})
      oldDF <- Reduce(cbind,c(list(oldDF),tempDF))
      
      newDF <- oldDF[subsettingColumns]
      newDF
      

      Result

      > head(newDF)
                        am IDontExist gear IAlsoDontExist
      Mazda RX4          1         NA    4             NA
      Mazda RX4 Wag      1         NA    4             NA
      Datsun 710         1         NA    4             NA
      Hornet 4 Drive     0         NA    3             NA
      Hornet Sportabout  0         NA    3             NA
      Valiant            0         NA    3             NA
      

      【Discussion】:

        【Solution 4】:

        Building on the basic structure of Andre Elrico's answer, you can do the following:

        newDf <- data.frame(sapply(subsettingColumns,
                                   function(x) if(x %in% names(oldDF)) oldDF[[x]] else NA))
        

        The first 6 rows are

        head(newDf)
          am IDontExist gear IAlsoDontExist
        1  1         NA    4             NA
        2  1         NA    4             NA
        3  1         NA    4             NA
        4  0         NA    3             NA
        5  0         NA    3             NA
        6  0         NA    3             NA
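One thing to watch: sapply simplifies its result when it can, so if every requested column is missing it returns a plain vector and the data frame comes out with the wrong shape. The row names of oldDF are also dropped, as the output above shows. A sketch of a variant that avoids both by using lapply and full-length NA columns (an alternative formulation, not the answer's own code; the name colsList is introduced here for illustration):

```r
oldDF <- mtcars
subsettingColumns <- c("am", "IDontExist", "gear", "IAlsoDontExist")

# lapply always returns a list, one element per requested name
colsList <- lapply(subsettingColumns, function(x) {
  if (x %in% names(oldDF)) oldDF[[x]] else rep(NA, nrow(oldDF))
})
names(colsList) <- subsettingColumns

# Build the data frame and carry over the original row names
newDf <- data.frame(colsList, row.names = rownames(oldDF), check.names = FALSE)
head(newDf)
```

Setting check.names = FALSE keeps the requested names exactly as given, even if they are not syntactically valid R names.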
        

        Data

        oldDF <- mtcars
        subsettingColumns <- c("am","IDontExist","gear","IAlsoDontExist")
        

        【Discussion】:
