【Question title】: How can I subtract 1 column from another in R based on 2 aggregate columns
【Posted】: 2021-01-21 16:00:39
【Question】:

First Data frame 'total_coming_in' column names: 'LocationID','PartNumber',"Quantity"

Second Data frame 'total_going_out' column names: 'LocationID','PartNumber',"Quantity"

I want the output 'total_data' to have the column names 'LocationID', 'PartNumber', 'Quantity_subtract', where Quantity_subtract = total_coming_in$Quantity - total_going_out$Quantity for each 'LocationID', 'PartNumber' group. I tried this:

matchingCols <- c('LocationID','PartNumber')
mergingCols <- names(coming_in)[3]
total_coming_in[total_going_out,on=matchingCols, 
                                lapply(
                                  setNames(mergingCols),
                                  function(x) get(x) - get(paste0("i.", x))
                                ),
      nomatch=0L,
      by=.EACHI
      ]

【Comments】:

    Tags: r database dataframe data-manipulation


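    【Solution 1】:

    Using data.table as you intended, I would first merge the two tables cleanly and then perform the subtraction only on the rows where it makes sense (i.e. rows in total_coming_in that have a matching row in total_going_out, and vice versa):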

    library(data.table)
    M <- merge(total_coming_in, total_going_out, by = c('LocationID','PartNumber'))   
    # i.e. all.x = FALSE, all.y = FALSE, 
    # thereby eliminating rows in x without matching row in y and vice-versa  
    M[ , Quantity_subtract := Quantity.x - Quantity.y, 
         by = c('LocationID','PartNumber')]
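
To make the effect of the inner merge concrete, here is a minimal sketch on toy data. The values below are invented for illustration and are not from the question; the sketch also assumes each (LocationID, PartNumber) pair appears at most once per table, so no `by` is needed in the subtraction.

```r
library(data.table)

# Hypothetical toy data, invented for illustration only
total_coming_in <- data.table(
  LocationID = c(1, 1, 2),
  PartNumber = c("A", "B", "A"),
  Quantity   = c(10, 5, 7)
)
total_going_out <- data.table(
  LocationID = c(1, 2, 3),
  PartNumber = c("A", "A", "C"),
  Quantity   = c(4, 7, 2)
)

# Inner merge: only (LocationID, PartNumber) pairs present in both tables survive.
# The clashing Quantity columns get the default suffixes .x and .y.
M <- merge(total_coming_in, total_going_out, by = c("LocationID", "PartNumber"))
M[, Quantity_subtract := Quantity.x - Quantity.y]

# Keep only the columns the question asked for
total_data <- M[, .(LocationID, PartNumber, Quantity_subtract)]
print(total_data)
```

The pairs (1, "B") and (3, "C") are dropped because each exists in only one table. If the inputs are plain data frames rather than data.tables, converting them first (for example with `setDT()`) is needed before `:=` can be used.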
    

    Now, for completeness: your question could also be read as treating the quantity as 0 for rows of total_coming_in that have no matching row in total_going_out, and vice versa. In that case you can do this:

    M <- merge(total_coming_in, total_going_out, all = TRUE, by = c('LocationID','PartNumber'))
    # i.e. all.x = TRUE, all.y = TRUE, 
    # thereby allowing rows in x without matching row in y and vice-versa    
    
    M[is.na(Quantity.x), Quantity.x := 0]
    M[is.na(Quantity.y), Quantity.y := 0]
    M[ , Quantity_subtract := Quantity.x - Quantity.y, 
         by = c('LocationID','PartNumber')]
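
As a variation on the same idea, the zero fill can be done inside the subtraction with `fcoalesce()`, which leaves the `NA`s in `Quantity.x` and `Quantity.y` visible so unmatched rows stay identifiable. This is only a sketch: the toy values are invented, and it assumes the quantity columns are doubles (`fcoalesce()` expects its arguments to share a type) and a data.table version that provides `fcoalesce()`.

```r
library(data.table)

# Hypothetical toy data, invented for illustration only
total_coming_in <- data.table(
  LocationID = c(1, 1),
  PartNumber = c("A", "B"),
  Quantity   = c(10, 5)
)
total_going_out <- data.table(
  LocationID = c(1, 3),
  PartNumber = c("A", "C"),
  Quantity   = c(4, 2)
)

# Full outer merge: unmatched rows are kept, with NA in the missing Quantity column
M <- merge(total_coming_in, total_going_out, all = TRUE,
           by = c("LocationID", "PartNumber"))

# Treat a missing quantity as 0 only for the purpose of the subtraction
M[, Quantity_subtract := fcoalesce(Quantity.x, 0) - fcoalesce(Quantity.y, 0)]
print(M)
```

Here (1, "B") only comes in, so its difference equals the incoming quantity, while (3, "C") only goes out, so its difference is negative.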
    

    【Comments】:

      【Solution 2】:

      So you want a column that gives the difference between total_coming_in and total_going_out for each combination of PartNumber and LocationID, right?

      If so, the following will do:

      library(dplyr)
      matchingCols <- c("LocationID", "PartNumber")
      total_data <- full_join(total_coming_in, total_going_out, by=matchingCols)
      total_data <- mutate(total_data, Quantity_subtract = Quantity.x - Quantity.y)
      total_data <- select(total_data, -Quantity.x, -Quantity.y) #if you want to get rid of these columns
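
Note that `full_join()` leaves `NA` in `Quantity.x` or `Quantity.y` for combinations that exist in only one table, so the plain subtraction yields `NA` for those rows. If a missing quantity should count as 0 instead (an assumption about what you want, not something the question states), a sketch with `coalesce()` on invented toy data could look like this:

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
total_coming_in <- tibble(
  LocationID = c(1, 1),
  PartNumber = c("A", "B"),
  Quantity   = c(10, 5)
)
total_going_out <- tibble(
  LocationID = c(1, 3),
  PartNumber = c("A", "C"),
  Quantity   = c(4, 2)
)

matchingCols <- c("LocationID", "PartNumber")

total_data <- full_join(total_coming_in, total_going_out, by = matchingCols) %>%
  # coalesce() substitutes 0 wherever a side of the join had no matching row
  mutate(Quantity_subtract = coalesce(Quantity.x, 0) - coalesce(Quantity.y, 0)) %>%
  select(LocationID, PartNumber, Quantity_subtract) %>%
  arrange(LocationID, PartNumber)

print(total_data)
```

Dropping the `coalesce()` calls restores the behaviour of the code above, where unmatched combinations end up with `NA`.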
      

      I used this sample data:

      total_coming_in <- list(LocationID = round(runif(26, 1000, 9000)),
                              PartNumber = paste(runif(26, 10000, 20000), LETTERS, sep="-"),
                              Quantity = round(runif(26, 2, 4))
                              ) %>% as_tibble()
      random_integers <- sample(1:26,26,FALSE)
      total_going_out <- list(LocationID = total_coming_in$LocationID[random_integers],
                              PartNumber = total_coming_in$PartNumber[random_integers],
                              Quantity = round(runif(26, 1, 3))
                              ) %>% as_tibble()
      

      【Comments】:
