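【Question title】: Apply family solution for multiple for loops?
【Posted】: 2017-09-22 16:00:29
【Question】:

I have a data frame that contains several variables (a1, a2, ... an) and a string column, and I am trying to determine whether the string in each an variable occurs in the string column. Each an variable has a corresponding cn variable. For example, if the string in a1 occurs in string, I want c1 to contain Checked, and so on. I developed the for-loop solution below (example data at the end of this post), but I was wondering whether there is an apply-family solution that would be faster and easier to code. In the real data there are more than 100 a and c variables.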

#For loop solution

for (var in seq(2, 10, 2)){
  for (i in 1:nrow(df)){
    df[i, var] <- ifelse(grepl(df[i, var-1], df$string[i]), "Checked", "Unchecked")
  }
}


#### Example data ####
a1<-c("zebra", "giraffe", "elephant")
a2<-c("hyena", "monkey", "antelope")
a3<-c("badger", "deer", "kangaroo")
a4<-c("tiger", "lion", "coyote")
a5<-c("penguin", "bear", "gorilla")

c1<-""
c2<-""
c3<-""
c4<-""
c5<-""

string<-c("elephant/bear/coyote/penguin/monkey",
          "giraffe/antelope/monkey/gorilla/tiger",
          "elephant/antelope/kangaroo/coyote/gorilla")

df<-cbind.data.frame(a1, c1, a2, c2, a3, c3, a4, c4, a5, c5, string, 
stringsAsFactors=F)

【Comments】:

    Tags: r for-loop apply


    【Solution 1】:

    You could do it this way. It generalizes to any number of a columns.

    library(dplyr) # provides the %>% pipe, used here for readability
    a <- cbind.data.frame(a1, a2, a3, a4, a5, stringsAsFactors = FALSE)
    
    a %>%
      # For each a column, test every value against every string, then keep
      # the diagonal, i.e. the row-wise matches
      sapply(function(x) {mapply(function(a) grepl(a, string), x) %>% diag}) %>%
      ifelse("Checked", "Unchecked") %>%  # Convert TRUE/FALSE to Checked/Unchecked
      data.frame %>%                      # Convert to data.frame
      setNames(paste0("c", seq_along(a))) # Name the columns c1, c2, ...
    
             c1        c2        c3        c4        c5
    1 Unchecked Unchecked Unchecked Unchecked   Checked
    2   Checked   Checked Unchecked Unchecked Unchecked
    3   Checked   Checked   Checked   Checked   Checked
    

    In base R:

    c_column = data.frame(ifelse(sapply(a, function(x) diag(mapply(function(a) grepl(a, string), x))), "Checked", "Unchecked"))
    names(c_column) = paste0("c", 1:5)
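
The `diag()` step above evaluates `grepl()` for every value against every string and then discards all but the row-wise results, which may become costly with many rows. A minimal sketch of a pairwise alternative follows. It assumes the columns are named `a1..an` and `c1..cn` as in the question; the helper name `check_column` and the use of `fixed = TRUE` (treating values as literal text rather than regular expressions) are assumptions, not part of the original answers.

```r
# Example data copied from the question.
df <- data.frame(
  a1 = c("zebra", "giraffe", "elephant"),  c1 = "",
  a2 = c("hyena", "monkey", "antelope"),   c2 = "",
  a3 = c("badger", "deer", "kangaroo"),    c3 = "",
  a4 = c("tiger", "lion", "coyote"),       c4 = "",
  a5 = c("penguin", "bear", "gorilla"),    c5 = "",
  string = c("elephant/bear/coyote/penguin/monkey",
             "giraffe/antelope/monkey/gorilla/tiger",
             "elephant/antelope/kangaroo/coyote/gorilla"),
  stringsAsFactors = FALSE
)

# Find the a columns by name and derive the matching c column names.
a_cols <- grep("^a[0-9]+$", names(df), value = TRUE)
c_cols <- sub("^a", "c", a_cols)

# Hypothetical helper: compare patterns[i] with targets[i] only (pairwise).
check_column <- function(patterns, targets) {
  hit <- mapply(grepl, patterns, targets,
                MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)
  ifelse(hit, "Checked", "Unchecked")
}

# One pass per a column; results are assigned to the c columns by position.
df[c_cols] <- lapply(df[a_cols], check_column, targets = df$string)

df$c1
# [1] "Unchecked" "Checked"   "Checked"
```

Because the column names are looked up rather than hard-coded, the same lines should work unchanged for 100 or more a/c pairs, provided the naming pattern holds.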
    

    【Comments】:

      【Solution 2】:

      This does what you asked, using sapply:

      df[,2*(1:5)] <- t(sapply(1:nrow(df), 
                         function(i) sapply(2*(1:5)-1, 
                            function(j) c("Unchecked","Checked")[1+grepl(df[i,j], df$string[i])]
                        )))
      
      df
              a1        c1       a2        c2       a3        c3     a4        c4      a5        c5                                    string
      1    zebra Unchecked    hyena Unchecked   badger Unchecked  tiger Unchecked penguin   Checked       elephant/bear/coyote/penguin/monkey
      2  giraffe   Checked   monkey   Checked     deer Unchecked   lion Unchecked    bear Unchecked     giraffe/antelope/monkey/gorilla/tiger
      3 elephant   Checked antelope   Checked kangaroo   Checked coyote   Checked gorilla   Checked elephant/antelope/kangaroo/coyote/gorilla
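
One caveat with every `grepl()`-based approach in this thread: `grepl()` matches substrings, so a value such as "ant" would count as present in "elephant". If whole-entry matches are what is wanted, a possible sketch is to split `string` on "/" and test membership with `%in%`. The data below is hypothetical and chosen only to show the difference; it assumes "/" is always the separator.

```r
# Hypothetical data: "ant" is a substring of "elephant" but not a full entry.
string <- c("elephant/bear/coyote", "giraffe/antelope/monkey")
a1 <- c("ant", "monkey")

# Substring matching, as in the answers above (pairwise, literal text).
grepl_hit <- mapply(grepl, a1, string,
                    MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Whole-entry matching: split each string into its entries first.
tokens <- strsplit(string, "/", fixed = TRUE)
token_hit <- mapply(function(value, toks) value %in% toks,
                    a1, tokens, USE.NAMES = FALSE)

grepl_hit
# [1] TRUE TRUE
token_hit
# [1] FALSE  TRUE
```

On the example data from the question both variants give the same result, so this only matters if one value can occur inside a longer entry.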
      

      【Comments】:
