【Question title】: Conditional VLOOKUP equivalent in R (dataframe)
【Posted】: 2021-05-29 22:12:28
【Question description】:

I have two different data frames in R.

df1:

# V1 V2
1 200 300
2 201 301
3 202 302

df2:

# V1 V2 week
1 200 300 12-02-2018
2 301 201 25-05-2017
3 302 202 02-12-2016

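I would like to merge them using an equivalent of VLOOKUP.

The idea is to add week from df2 to df1 IF:

(df1$V1 == df2$V1 & df1$V2 == df2$V2) OR (df1$V1 == df2$V2 & df1$V2 == df2$V1)

Since V1 and V2 are randomly assigned, I need the condition to work in both directions.

Any help?

Thanks a lot!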

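【Comments】:

  • Hi Daniel, it would be clearer if you also added the expected output for your example. Are you happy to use packages such as dplyr or data.table? Depending on that, there are many ways to solve this.

Tags: r dataframe merge vlookup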


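【Solution 1】:

Sort V1 and V2 within each row of both data frames, so that the smaller value is always in V1 and the larger in V2, then perform a merge: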

df1 <- transform(df1, V1 = pmin(V1, V2), V2 = pmax(V1, V2))
df2 <- transform(df2, V1 = pmin(V1, V2), V2 = pmax(V1, V2))
merge(df1, df2, by = c('id', 'V1', 'V2'))

#  id  V1  V2       week
#1  1 200 300 12-02-2018
#2  2 201 301 25-05-2017
#3  3 202 302 02-12-2016

Data

df1 <- structure(list(id = 1:3, V1 = 200:202, V2 = 300:302), 
       row.names = c(NA, -3L), class = "data.frame")

df2 <- structure(list(id = 1:3, V1 = c(200L, 301L, 302L), V2 = c(300L, 
201L, 202L), week = c("12-02-2018", "25-05-2017", "02-12-2016"
)), row.names = c(NA, -3L), class = "data.frame")
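
The same idea can be sketched for data frames shaped like those in the question, which have no shared `id` column. This is only an illustration under that assumption: `sort_pair` is a hypothetical helper name, the extra row (203, 303) is added here to show a non-match, and `all.x = TRUE` is used so that rows of the first data frame without a match are kept with `week = NA`, as a VLOOKUP would.

```r
# Hypothetical helper: put the smaller value of each row in V1, the larger in V2
sort_pair <- function(d) {
  lo <- pmin(d$V1, d$V2)
  hi <- pmax(d$V1, d$V2)
  d$V1 <- lo
  d$V2 <- hi
  d
}

# Data frames as in the question (no id column), plus one pair without a match
a <- data.frame(V1 = c(200L, 201L, 202L, 203L),
                V2 = c(300L, 301L, 302L, 303L))
b <- data.frame(V1 = c(200L, 301L, 302L),
                V2 = c(300L, 201L, 202L),
                week = c("12-02-2018", "25-05-2017", "02-12-2016"))

# Left join on the normalised pair; unmatched rows of `a` get week = NA
res <- merge(sort_pair(a), sort_pair(b), by = c("V1", "V2"), all.x = TRUE)
res
```

Note that the result holds the sorted pair rather than the original orientation of V1 and V2, so keep copies of the original columns if that orientation matters.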

【Discussion】:

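    【Solution 2】:

    You can first merge on V1 (df1) = V1 (df2) and V2 (df1) = V2 (df2), then take the rows of df1 that did not satisfy those conditions. With those rows, you can then merge again on V1 (df1) = V2 (df2) and V2 (df1) = V1 (df2), which mimics the "OR" condition in the order you described.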

    #Replicates of your dataframes
    df1 <- data.frame(matrix(c(1, 200, 300, 
                               2, 201, 301, 
                               3, 202, 302), ncol=3, byrow = TRUE))
    colnames(df1) <- c("iddf1", "V1", "V2")
    
    df2 <- data.frame(matrix(c(1, 200, 300, "12-02-2018",
                               2, 301, 201, "25-05-2017",
                               3, 302, 202, "02-12-2016"), ncol=4, byrow = TRUE))
    colnames(df2) <- c("iddf2", "V1", "V2", "week")
    
    #Merge first on V1 (df1) = V1 (df2) and V2 (df1) = V2 (df2)
    df.merged.1 <- merge(df1, df2, by = c("V1", "V2"), all.x = TRUE)
    
    #Extract the rows that did not match
    df1.unmet <- df.merged.1[is.na(df.merged.1$iddf2),c("iddf1", "V1", "V2")]
    df.merged.1 <- df.merged.1[!is.na(df.merged.1$iddf2),]
    
    #Merge then on V1 (df1) = V2 (df2) and V2 (df1) = V1 (df2)
    df.merged.2 <- merge(df1.unmet, df2, by.x=c("V1", "V2"), by.y = c("V2", "V1"))
    
    #rbind the two dataframes to get the final result
    df.merged <- rbind(df.merged.1, df.merged.2)
    df.merged
    #   V1  V2 iddf1 iddf2       week
    #1 200 300     1     1 12-02-2018
    #2 201 301     2     2 25-05-2017
    #3 202 302     3     3 02-12-2016
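
A single-merge variant of the same "OR" logic can be sketched by first building a lookup table that holds every df2 pair in both orientations. This is an assumption-laden illustration rather than part of the answer above: the names `swapped`, `lookup` and `out` are made up here, and the data frames are rebuilt without id columns, as in the question.

```r
# Data frames as in the question (no id columns)
df1 <- data.frame(V1 = 200:202, V2 = 300:302)
df2 <- data.frame(V1 = c(200L, 301L, 302L),
                  V2 = c(300L, 201L, 202L),
                  week = c("12-02-2018", "25-05-2017", "02-12-2016"))

# Copy of df2 with V1 and V2 exchanged
swapped <- data.frame(V1 = df2$V2, V2 = df2$V1, week = df2$week)

# Lookup table with both orientations; unique() drops the duplicates
# that would otherwise appear for pairs where V1 == V2
lookup <- unique(rbind(df2, swapped))

# One left join now covers both branches of the OR condition
out <- merge(df1, lookup, by = c("V1", "V2"), all.x = TRUE)
out
```

This keeps df1's original orientation of V1 and V2. If df2 contained both (a, b) and (b, a) with different weeks, the corresponding df1 row would match twice, so that case would need a tie-breaking rule.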
    

    【Discussion】:
