【Question title】: Update multiple rows based on conditions
【Posted】: 2021-12-03 11:07:45
【Question】:

I want to update three columns at once, based on the value of one other column.

My data looks like this:

df <- data.frame(input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.", 
                  "Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
                 n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
                 n2 = c("Antidesma", NA, "Alchornea", NA),
                 n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA))

                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg.                 <NA>      <NA>           <NA>
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg.                 <NA>      <NA>           <NA>

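If the first two words of the input column are the same, the corresponding rows should be treated as the same name. In this example, that means the n1, n2 and n3 values in rows 2 and 4 should be filled in with the values from rows 1 and 3.

My desired output is: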

                                    input                   n1        n2             n3
1          Antidesma cuspidatum Mull.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
2          Antidesma cuspidatum Müll.Arg. Antidesma cuspidatum Antidesma Phyllanthaceae
3 Alchornea parviflora (Benth.) Mull.Arg. Alchornea parviflora Alchornea  Euphorbiaceae
4 Alchornea parviflora (Benth.) Müll.Arg. Alchornea parviflora Alchornea  Euphorbiaceae

Any suggestions for this case?

【Comments】:

    Tags: r regex dplyr tidyverse stringr


    【Solution 1】:

    Base R solution:

    # Resolve the names of the columns to fill (n1, n2, n3):
    # na_col_names => character vector
    na_col_names <- grep(
      "^n\\d+$",
      names(df),
      value = TRUE
    )
    
    # Carry the last non-NA value forward. This assumes the filled-in row
    # comes first within each group and that row 1 is not NA: df => data.frame
    df[, na_col_names] <- lapply(
      df[, na_col_names, drop = FALSE],
      function(x){
        na.omit(x)[cumsum(!is.na(x))]
      }
    )
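
The carry-forward approach depends on row order rather than on the input column itself. As a hedged sketch (not part of the original answer), the same fill can be keyed on the first two words of input using base R's `ave()`. The names `key`, `n_cols` and the helper `first_non_na` are hypothetical, introduced only for this illustration.

```r
# Sample data from the question
df <- data.frame(
  input = c("Antidesma cuspidatum Mull.Arg.", "Antidesma cuspidatum Müll.Arg.",
            "Alchornea parviflora (Benth.) Mull.Arg.", "Alchornea parviflora (Benth.) Müll.Arg."),
  n1 = c("Antidesma cuspidatum", NA, "Alchornea parviflora", NA),
  n2 = c("Antidesma", NA, "Alchornea", NA),
  n3 = c("Phyllanthaceae", NA, "Euphorbiaceae", NA),
  stringsAsFactors = FALSE
)

# Grouping key: the first two whitespace-separated words of `input`
key <- sub("^(\\S+\\s+\\S+).*$", "\\1", df$input)

# Hypothetical helper: repeat the group's first non-NA value for every row
first_non_na <- function(x) rep(x[!is.na(x)][1], length(x))

# Apply the helper within each key group, column by column
n_cols <- grep("^n\\d+$", names(df), value = TRUE)
df[n_cols] <- lapply(df[n_cols], function(col) ave(col, key, FUN = first_non_na))
```

Because the fill is done per group, the result does not change if the rows are shuffled or if the row holding the values comes after the NA row.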
    

    Tidyverse:

    library(tidyverse)
    # fill() operates on data frame columns, so select the n* columns directly
    df %>% 
      fill(matches("^n\\d+$"), .direction = "down")
    

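    【Comments】:

      【Solution 2】:

      You can use the dplyr package. First, I create a column gr that contains only the first two words of input. Then I change (mutate) the columns n1, n2 and n3 so that each holds the first non-NA value of its group.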

      library(dplyr)
      
      df %>%
        group_by(gr = gsub("(^\\w+ \\w+) .*", "\\1", input)) %>%
        mutate(across(c(n1, n2, n3), ~.x[!is.na(.x)][1])) %>%
        ungroup()
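
To see what the grouping key looks like, the regular expression can be tried on its own in base R. This is only an illustrative sketch: the hyphenated name below is not from the question's data, and the `\\S+` variant is an assumption about how one might handle such names, not something the original answer proposes.

```r
input <- c("Antidesma cuspidatum Mull.Arg.",
           "Alchornea parviflora (Benth.) Mull.Arg.")

# Same pattern as in the answer: keep the first two "word" tokens
gr <- gsub("(^\\w+ \\w+) .*", "\\1", input)
print(gr)

# Illustrative name with a hyphenated epithet (not in the original data):
# \\w does not match "-", so the pattern fails and the string is unchanged
hyphenated <- "Arctostaphylos uva-ursi (L.) Spreng."
gr_word <- gsub("(^\\w+ \\w+) .*", "\\1", hyphenated)

# Matching on non-space characters instead keeps the full epithet
gr_nonspace <- sub("^(\\S+ \\S+).*", "\\1", hyphenated)
print(c(gr_word, gr_nonspace))
```

For the data in the question both patterns give the same keys, so the choice only matters if the real data contains hyphens or other non-word characters in the first two tokens.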
      

      【Comments】:
