【Question Title】: Match and replace value using 2 Data Frames (R)
【Posted】: 2020-05-04 01:24:39
【Question Description】:

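I have two data frames. I need to match details$Name against info$Name and, where a match is found, replace the corresponding value in details$Salary with info$income. details should keep all of its rows and must not contain any NA: if a match is found the value is replaced, otherwise it is left as is.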

details<- data.frame(Name = c("Aks","Bob","Caty","David","Enya","Fredrick","Gaby","Hema","Isac","Jaby","Katy"),
                     Age = c(12,22,33,43,24,67,41,19,25,24,32),
                     Gender = c("f","m","m","f","m","f","m","f","m","m","m"),
                     Salary = c(1500,2000,3.6,8500,1.2,1400,2300,2.5,5.2,2000,1265))

info <- data.frame(Name = c("caty","Enya","Dadi","Enta","Billu","Viku","situ","Hema","Ignu","Isac"),
                income = c(2500,5600,3200,1522,2421,3121,4122,5211,1000,3500))   

Expected result:

Name      Age Gender Salary
Aks       12      f   1500
Bob       22      m   2000
Caty      33      m   2500
David     43      f   8500
Enya      24      m   5600
Fredrick  67      f   1400
Gaby      41      m   2300
Hema      19      f   5211
Isac      25      m   3500
Jaby      24      m   2000
Katy      32      m   1265     

None of the following gives the expected result:

dplyr::left_join(details,info,by = "Name") 
dplyr::right_join(details,info,by = "Name") 
dplyr::inner_join(details,info, by ="Name") # for other matching and replace this works fine but not here
dplyr::full_join(details,info,by ="Name")

All of them produce NAs. I also tried the match function, but it did not give the desired result either. Any help would be appreciated.

【Question Comments】:

    Tags: r replace match


    【Solution 1】:

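    The Name values in the two data frames differ in letter case ("Caty" vs. "caty"), so we first need to bring them to the same case. Then we left_join the data frames and use coalesce to pick the first non-NA value between income and Salary.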

    library(dplyr)
    
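    # Put Name in title case in both data frames so that "caty" matches "Caty"
    details %>% mutate(Name = stringr::str_to_title(Name)) %>%
      left_join(info %>% mutate(Name = stringr::str_to_title(Name)), by = "Name") %>%
      # Take income where a match was found, otherwise keep the original Salary
      mutate(Salary = coalesce(income, Salary)) %>%
      # Drop the income column again, keeping only the columns of details
      select(names(details))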
    
    #       Name Age Gender Salary
    #1       Aks  12      f   1500
    #2       Bob  22      m   2000
    #3      Caty  33      m   2500
    #4     David  43      f   8500
    #5      Enya  24      m   5600
    #6  Fredrick  67      f   1400
    #7      Gaby  41      m   2300
    #8      Hema  19      f   5211
    #9      Isac  25      m   3500
    #10     Jaby  24      m   2000
    #11     Katy  32      m   1265
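
As a quick check of why case matters here, the base R sketch below counts how many names in details find a partner in info before and after case normalization. It rebuilds only the two Name columns from the question; the vector names details_names and info_names are made up for this illustration, and tolower stands in for str_to_title, on the assumption that any consistent case works for matching.

```r
# Name columns copied from the question's details and info data frames
details_names <- c("Aks", "Bob", "Caty", "David", "Enya", "Fredrick",
                   "Gaby", "Hema", "Isac", "Jaby", "Katy")
info_names <- c("caty", "Enya", "Dadi", "Enta", "Billu", "Viku",
                "situ", "Hema", "Ignu", "Isac")

# Case-sensitive comparison: "Caty" and "caty" count as different names
exact_hits <- details_names %in% info_names
sum(exact_hits)
# [1] 3

# Case-insensitive comparison after lower-casing both sides
ci_hits <- tolower(details_names) %in% tolower(info_names)
sum(ci_hits)
# [1] 4

# The name that only matches once case is ignored
details_names[ci_hits & !exact_hits]
# [1] "Caty"
```

The remaining seven names have no partner in info at all, which is why a plain left_join leaves NA in income for those rows and why coalesce is needed to fall back to Salary.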
    

    【Comments】:

    • Just a follow-up question: assuming there were no upper/lower-case problem, would this work? details %>% left_join(details,info,by ="Name") %>% mutate(Salary = coalesce(income, Salary)) %>% select(names(details)) gives an error.
    • @rajeshdhingra You are passing 3 data frames to left_join (the piped details, then details again, then info). Use details %>% left_join(info,by ="Name") %>% mutate(Salary = coalesce(income, Salary)) %>% select(names(details))
    【Solution 2】:

    Base R solution:

    
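    # Position of each details$Name in info$Name, ignoring case (NA if absent)
    matches <- match(tolower(details$Name), tolower(info$Name))
    # TRUE for the rows of details that have a partner in info
    found <- !is.na(matches)
    
    # Overwrite Salary only for the matched rows
    details$Salary[found] <- info$income[matches[found]]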
    
    # Result
           Name Age Gender Salary
    1       Aks  12      f   1500
    2       Bob  22      m   2000
    3      Caty  33      m   2500
    4     David  43      f   8500
    5      Enya  24      m   5600
    6  Fredrick  67      f   1400
    7      Gaby  41      m   2300
    8      Hema  19      f   5211
    9      Isac  25      m   3500
    10     Jaby  24      m   2000
    11     Katy  32      m   1265
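
If the original details should stay untouched, the same match-based idea can be wrapped in a small function that returns an updated copy. This is only a sketch: replace_matched and its argument names are hypothetical, not part of base R, and it assumes the key column has the same name in both data frames.

```r
# Hypothetical helper: returns a copy of `target` in which `value_col` is
# overwritten with `lookup[[new_col]]` wherever `key_col` matches, ignoring case.
replace_matched <- function(target, lookup, key_col, value_col, new_col) {
  idx <- match(tolower(target[[key_col]]), tolower(lookup[[key_col]]))
  found <- !is.na(idx)                       # rows of target with a partner
  target[[value_col]][found] <- lookup[[new_col]][idx[found]]
  target                                     # unmatched rows keep their value
}

# Data from the question
details <- data.frame(
  Name = c("Aks", "Bob", "Caty", "David", "Enya", "Fredrick",
           "Gaby", "Hema", "Isac", "Jaby", "Katy"),
  Age = c(12, 22, 33, 43, 24, 67, 41, 19, 25, 24, 32),
  Gender = c("f", "m", "m", "f", "m", "f", "m", "f", "m", "m", "m"),
  Salary = c(1500, 2000, 3.6, 8500, 1.2, 1400, 2300, 2.5, 5.2, 2000, 1265)
)
info <- data.frame(
  Name = c("caty", "Enya", "Dadi", "Enta", "Billu", "Viku",
           "situ", "Hema", "Ignu", "Isac"),
  income = c(2500, 5600, 3200, 1522, 2421, 3121, 4122, 5211, 1000, 3500)
)

updated <- replace_matched(details, info, "Name", "Salary", "income")
updated$Salary
# [1] 1500 2000 2500 8500 5600 1400 2300 5211 3500 2000 1265
```

Note that match returns only the first matching position, so if info contained the same name more than once, the first occurrence would be used. A left_join in that situation would instead return one row per match.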
    
    

    【Comments】:
