Using gsub and mapply to remove a vector of words from another vector of words of different lengths [duplicate]: answers

【Question title】: Using gsub and mapply to remove a vector of words from another vector of words of different lengths [duplicate]
【Posted】: 2020-07-10 01:10:20
【Question】:

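I have a vector of words that I want to remove from another vector of words. I am using mapply and gsub, but I get the error "longer argument not a multiple of length of shorter".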

sw_column <- c(stop_words$word)
head(sw_column)
#[1] "a"         "a's"       "able"      "about"     "above"     "according"

x <- c(amplification.words, deamplification.words, negation.words)
head(x)
#[1] "acute"      "acutely"    "certain"    "certainly"  "colossal"   "colossally"

stop_words_clean <- mapply(gsub, x, "", sw_column)
# error message: longer argument not a multiple of length of shorter

I want to remove from sw_column every word that appears in x. Note: not all words in x appear in sw_column.

【Question comments】:

Do you need sw_column[sw_column %in% x] <- '' ?
Maybe: stop_words_clean <- setdiff(sw_column, x). It is hard to know what your expected output looks like.
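
The two suggestions in the comments give different results, so it may help to see them side by side. The sketch below uses made-up vectors (not the asker's real data) and assumes exact whole-word matching is wanted: the first approach blanks matching entries and keeps the vector length, the second drops them.

```r
# Made-up stand-ins for the asker's vectors
sw_column <- c("a", "a's", "able", "about", "above", "according")
x <- c("a", "able", "above", "acute")  # "acute" is not in sw_column

# Comment 1: blank out matches in place; the length is unchanged
blanked <- sw_column
blanked[blanked %in% x] <- ""
blanked
#[1] ""          "a's"       ""          "about"     ""          "according"

# Comment 2: drop matches; the result is shorter
dropped <- setdiff(sw_column, x)
dropped
#[1] "a's"       "about"     "according"
```

Words in x that never occur in sw_column (here "acute") are simply ignored by both approaches.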

Tags: r gsub mapply

【Solution 1】:

If you want to filter one character vector against another, you can use the code below. I use some made-up vectors to illustrate.

stop_words_example <- c("a", "a's", "able", "about", "above", "according")
x <- c("a", "a's", "able", "about", "above", "according", "acute", "acutely", "certain", "certainly", "colossal", "colossally")

x[!x %in% stop_words_example]

#[1] "acute"      "acutely"    "certain"    "certainly"  "colossal"   "colossally"
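
The example above removes the stop words from x, whereas the question asks for the opposite direction (removing the words in x from sw_column). A minimal sketch of the same %in% idiom with the roles swapped follows; the vectors are made up, and "about" is duplicated on purpose to show one way this idiom differs from setdiff.

```r
# Made-up vectors; "about" appears twice on purpose
sw_column <- c("a", "a's", "able", "about", "above", "according", "about")
x <- c("a", "able", "above", "acute")

# Keep every element of sw_column that is not in x (duplicates survive)
kept <- sw_column[!sw_column %in% x]
kept
#[1] "a's"       "about"     "according" "about"

# setdiff() treats its inputs as sets, so duplicates are removed
deduped <- setdiff(sw_column, x)
deduped
#[1] "a's"       "about"     "according"
```

Which of the two is preferable depends on whether repeated entries in sw_column should be preserved.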

【Comments】:

【Solution 2】:

Just a guess, but setdiff(x, y) returns the elements of "x" (the first argument) that are not in "y" (the second argument). So,

stop_words_clean <- setdiff(sw_column, x)

may be what you are after.

Example:

sw_column <- c("a", "a's","able","about", "above","according")
x <- c("a", "able", "above")

setdiff(sw_column, x)
#[1] "a's"       "about"     "according"

As for gsub, that function modifies the elements of a character vector, which is not your stated goal.

【Comments】:
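
To illustrate that last point, here is a small sketch with made-up vectors. It shows that gsub edits text inside each element without ever dropping an element, and that mapply pairs its arguments position by position (recycling the shorter one) rather than testing every word in x against every word in sw_column. The lengths here are chosen as 3 and 6 so that recycling is exact.

```r
sw_column <- c("a", "a's", "able", "about", "above", "according")
x <- c("a", "able", "above")

# gsub() rewrites each element; the result has the same length as the input
edited <- gsub("a", "", sw_column, fixed = TRUE)
edited
#[1] ""         "'s"       "ble"      "bout"     "bove"     "ccording"

# mapply() calls gsub(x[i], "", sw_column[i]) pairwise, recycling x,
# so "about" is only ever compared against the pattern "a"
paired <- mapply(gsub, x, "", sw_column, USE.NAMES = FALSE)
paired
#[1] ""          "a's"       "able"      "bout"      "above"     "according"
```

When the two lengths are not multiples of each other, this recycling cannot line up evenly, which is what the message quoted in the question is complaining about.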