【Question title】: R: How do you apply grep() in lapply()
【Posted】: 2016-06-29 01:24:49
【Question】:

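I want to use grep() in R, but I am not very comfortable with lapply(). I know that lapply() takes a list, applies a function to each element, and returns a list. For example, let x be a list with 2 elements.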

> x<-strsplit(docs$Text," ")
> 
> x
[[1]]
 [1] "I"         "lovehttp"  "my"        "mum."      "I"         "love"     
 [7] "my"        "dad."      "I"         "love"      "my"        "brothers."

[[2]]
 [1] "I"         "live"      "in"        "Eastcoast" "now."      "Job.I"    
 [7] "used"      "to"        "live"      "in"        "WestCoast."  

I want to use grep() to remove the words that contain http. So I tried:

> lapply(x,grep(pattern="http",invert=TRUE, value=TRUE))

But it does not work; it says

Error in grep(pattern = "http", invert = TRUE, value = TRUE) : 
argument "x" is missing, with no default

So I tried

> lapply(x,grep(pattern="http",invert=TRUE, value=TRUE,x))

But it says

Error in match.fun(FUN) : 
'grep(pattern = "http", invert = TRUE, value = TRUE, x)' is not a 
function, character or symbol

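Please help, thanks!

【Comments】:

  • You need to pass your data to grep() at the point where it is used.
  • @TimBiegeleisen Originally, I wanted to remove every whole word that contains http. So, since "lovehttp" contains "http", it would be removed. If I only want to remove "http" and keep "love", is that possible?
  • You need to be clear about what answer you want. You are now changing the requirements.

Tags: r lapply sapply tapply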


【Solution 1】:

This can be done in one line (here lst is the list that the question calls x):

lst <- lapply(lst, grep, pattern="http", value=TRUE, invert=TRUE)

#lst
#[[1]]
# [1] "I"         "my"        "mum."      "I"         "love"      "my"        "dad."      "I"         "love"      "my"        "brothers."
#
#[[2]]
# [1] "I"          "live"       "in"         "Eastcoast"  "now."       "Job.I"      "used"       "to"         "live"       "in"         "WestCoast."
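
The one-liner works because lapply() forwards any extra named arguments to FUN through `...`, so each list element ends up as grep()'s x argument. The attempts in the question failed because they called grep() on the spot instead of handing lapply() the function itself. A minimal sketch of two equivalent forms, using a small made-up list (`demo` and `words` are illustrative names, not from the original post):

```r
# Small illustrative list; not the full data from the question
demo <- list(c("I", "lovehttp", "my", "mum."),
             c("I", "live", "in", "Eastcoast"))

# Form 1: pass grep itself; the named arguments travel through `...`
a <- lapply(demo, grep, pattern = "http", value = TRUE, invert = TRUE)

# Form 2: wrap the call in an anonymous function, naming the element explicitly
b <- lapply(demo, function(words) {
  grep("http", words, value = TRUE, invert = TRUE)
})

# Both forms produce the same list
same <- identical(a, b)
print(same)
```

Form 1 is shorter; form 2 is easier to extend if more than one step has to be applied to each element.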

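If you do not want to remove the whole word that contains the pattern, but only the pattern itself while keeping the rest of the word (as mentioned in the comments), you can use gsub() instead of grep():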

lapply(lst, gsub, pattern="http", replacement="")
#[[1]]
# [1] "I"         "love"      "my"        "mum."      "I"         "love"      "my"        "dad."      "I"         "love"      "my"        "brothers."
#
#[[2]]
# [1] "I"          "live"       "in"         "Eastcoast"  "now."       "Job.I"      "used"       "to"         "live"       "in"         "WestCoast."
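
To make the difference between the two approaches concrete, the sketch below applies both to a single vector. The vector `words` is a shortened, illustrative copy of the first list element; `fixed = TRUE` is an optional addition that treats the pattern as a literal string rather than a regular expression.

```r
words <- c("I", "lovehttp", "my", "mum.")

# grep() with invert = TRUE drops the whole matching element
dropped <- grep("http", words, value = TRUE, invert = TRUE)

# gsub() keeps every element and only deletes the matched substring
stripped <- gsub("http", "", words, fixed = TRUE)

print(dropped)   # one element fewer than `words`
print(stripped)  # same length as `words`, "lovehttp" becomes "love"
```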


    【Solution 2】:

    The following code removes every entry that contains the substring http from each vector in the list:

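    # Drop every element of x that contains "http"
    repx <- function(x) {
        y <- grep("http", x)           # indices of the matching elements
        vec <- rep(TRUE, length(x))    # start by keeping everything
        vec[y] <- FALSE                # mark the matches for removal
        x[vec]
    }
    
    lapply(lst, repx)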
    

    Data:

    x1 <- c("I", "lovehttp", "my", "mum.", "I", "love", "my", "dad.", "I", "love", "my", "brothers.")
    x2 <- c("I", "live", "in", "Eastcoast", "now.", "Job.I", "used", "to", "live", "in", "WestCoast.")
    lst <- list(x1, x2)
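
The same filtering can also be written with grepl(), which returns a logical vector directly, so no intermediate index vector is needed. This is an alternative sketch rather than part of the original answer; the helper name `drop_matching` is hypothetical, and the data is repeated so the block runs on its own.

```r
x1 <- c("I", "lovehttp", "my", "mum.", "I", "love", "my", "dad.", "I", "love", "my", "brothers.")
x2 <- c("I", "live", "in", "Eastcoast", "now.", "Job.I", "used", "to", "live", "in", "WestCoast.")
lst <- list(x1, x2)

# Hypothetical helper: keep only the elements that do NOT contain `pattern`
drop_matching <- function(x, pattern = "http") {
  x[!grepl(pattern, x, fixed = TRUE)]
}

res <- lapply(lst, drop_matching)
print(res)
```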
    
