How to subtract 1 column from another in R based on 2 aggregate columns: answers

【Question title】: How can I subtract 1 column from another in R based on 2 aggregate columns
【Posted】: 2021-01-21 16:00:39
【Question】:

First Data frame 'total_coming_in' column names: 'LocationID','PartNumber',"Quantity"

Second Data frame 'total_going_out' column names: 'LocationID','PartNumber',"Quantity"

I want the output to be 'total_data' with column names 'LocationID', 'PartNumber', 'Quantity_subtract', where Quantity_subtract = total_coming_in$Quantity - total_going_out$Quantity for each 'LocationID', 'PartNumber' group. I tried this:

matchingCols <- c('LocationID','PartNumber')
mergingCols <- names(coming_in)[3]
total_coming_in[total_going_out,on=matchingCols, 
                                lapply(
                                  setNames(mergingCols),
                                  function(x) get(x) - get(paste0("i.", x))
                                ),
      nomatch=0L,
      by=.EACHI
      ]

【Comments】:

Tags: r database dataframe data-manipulation

【Solution 1】:

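Using data.table as you intended, I would first merge the two tables cleanly and then perform the subtraction only on the rows where it makes sense (i.e. rows in total_coming_in that have a matching row in total_going_out, and vice versa):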

library(data.table)
M <- merge(total_coming_in, total_going_out, by = c('LocationID','PartNumber'))   
# i.e. all.x = FALSE, all.y = FALSE, 
# thereby eliminating rows in x without matching row in y and vice-versa  
M[ , Quantity_subtract := Quantity.x - Quantity.y, 
     by = c('LocationID','PartNumber')]
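
For comparison, the join form attempted in the question can also compute the difference directly, without an intermediate merged table. The following is a minimal sketch, assuming both inputs are data.tables; the toy values are hypothetical and not taken from the question. Inside `x[i, ...]`, a bare `Quantity` refers to the column of `x` and `i.Quantity` to the column of `i`.

```r
library(data.table)

# Hypothetical toy data (not from the question)
total_coming_in <- data.table(
  LocationID = c(1, 1, 2),
  PartNumber = c("A", "B", "A"),
  Quantity   = c(10, 5, 7)
)
total_going_out <- data.table(
  LocationID = c(1, 2, 3),
  PartNumber = c("A", "A", "C"),
  Quantity   = c(4, 2, 1)
)

matchingCols <- c("LocationID", "PartNumber")

# Join total_going_out (i) onto total_coming_in (x) and subtract per matched row.
# nomatch = 0L drops rows of i that have no match in x (inner-join behaviour).
total_data <- total_coming_in[
  total_going_out,
  .(Quantity_subtract = Quantity - i.Quantity),
  on = matchingCols,
  by = .EACHI,
  nomatch = 0L
]

print(total_data)
# Rows kept: (1, "A") with 10 - 4 = 6 and (2, "A") with 7 - 2 = 5
```

The result has the columns LocationID, PartNumber and Quantity_subtract, which is the shape asked for in the question.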

Now, for completeness: your question could also be read as allowing a Quantity.y of 0 for rows of total_coming_in that have no matching row in total_going_out, and vice versa. In that case you can do this:

 M <- merge(total_coming_in, total_going_out, all = TRUE, by = c('LocationID','PartNumber'))   
# i.e. all.x = TRUE, all.y = TRUE, 
# thereby allowing rows in x without matching row in y and vice-versa    

M[is.na(Quantity.x), Quantity.x := 0]
M[is.na(Quantity.y), Quantity.y := 0]
M[ , Quantity_subtract := Quantity.x - Quantity.y, 
     by = c('LocationID','PartNumber')]
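
To make the difference between the two merges concrete, here is a small worked example on hypothetical toy data (the values are made up for illustration). It is a sketch under the assumption that both inputs are data.tables with numeric Quantity columns.

```r
library(data.table)

# Hypothetical toy data: (1, "B") only comes in, (3, "C") only goes out
total_coming_in <- data.table(
  LocationID = c(1, 1, 2),
  PartNumber = c("A", "B", "A"),
  Quantity   = c(10, 5, 7)
)
total_going_out <- data.table(
  LocationID = c(1, 2, 3),
  PartNumber = c("A", "A", "C"),
  Quantity   = c(4, 2, 1)
)
keys <- c("LocationID", "PartNumber")

# Inner merge: only key combinations present in both tables survive
inner <- merge(total_coming_in, total_going_out, by = keys)
inner[, Quantity_subtract := Quantity.x - Quantity.y]

# Outer merge: every key combination survives, missing quantities become 0
outer <- merge(total_coming_in, total_going_out, by = keys, all = TRUE)
outer[is.na(Quantity.x), Quantity.x := 0]
outer[is.na(Quantity.y), Quantity.y := 0]
outer[, Quantity_subtract := Quantity.x - Quantity.y]

print(inner$Quantity_subtract)  # 6 5
print(outer$Quantity_subtract)  # 6 5 5 -1
```

The outer variant reports 5 for the part that only came in and -1 for the part that only went out; the inner variant drops both rows.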

【Comments】:

【Solution 2】:

So you want a column that gives the difference between total_coming_in and total_going_out for each combination of PartNumber and LocationID, right?

If so, the following will do:

library(dplyr)
matchingCols <- c("LocationID", "PartNumber")
total_data <- full_join(total_coming_in, total_going_out, by=matchingCols)
total_data <- mutate(total_data, Quantity_subtract = Quantity.x - Quantity.y)
total_data <- select(total_data, -Quantity.x, -Quantity.y) #if you want to get rid of these columns
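
Note that full_join keeps key combinations that appear in only one table and fills the missing Quantity with NA, so Quantity_subtract is NA for those rows. If such rows should instead count as a quantity of 0 (an assumption about the intended semantics, not something the question states), a sketch with dplyr's coalesce on hypothetical toy data could look like this:

```r
library(dplyr)

# Hypothetical toy data: (1, "B") only comes in, (3, "C") only goes out
total_coming_in <- tibble(
  LocationID = c(1, 1, 2),
  PartNumber = c("A", "B", "A"),
  Quantity   = c(10, 5, 7)
)
total_going_out <- tibble(
  LocationID = c(1, 2, 3),
  PartNumber = c("A", "A", "C"),
  Quantity   = c(4, 2, 1)
)
matchingCols <- c("LocationID", "PartNumber")

total_data <- full_join(total_coming_in, total_going_out, by = matchingCols) %>%
  # coalesce() replaces NA with 0 before subtracting
  mutate(Quantity_subtract = coalesce(Quantity.x, 0) - coalesce(Quantity.y, 0)) %>%
  select(-Quantity.x, -Quantity.y)

print(total_data)
```

If only combinations present in both tables are wanted, inner_join can be used in place of full_join and the coalesce step is unnecessary.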

I used this sample data:

total_coming_in <- list(LocationID = round(runif(26, 1000, 9000)),
                        PartNumber = paste(runif(26, 10000, 20000), LETTERS, sep="-"),
                        Quantity = round(runif(26, 2, 4))
                        ) %>% as_tibble()
random_integers <- sample(1:26,26,FALSE)
total_going_out <- list(LocationID = total_coming_in$LocationID[random_integers],
                        PartNumber = total_coming_in$PartNumber[random_integers],
                        Quantity = round(runif(26, 1, 3))
                        ) %>% as_tibble()

【Comments】: