Answers: Regular expression to search and replace a string in a file

【Question title】: Regular expression to search and replace a string in a file
【Posted】: 2014-05-21 11:57:03
【Question】:

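Hi friends, I am trying to search a list of files for specific keywords (given in txt form). I am using a regular expression to detect and replace occurrences of the keywords in the files. Below is the comma-separated keyword string that I pass in for the search.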

library(stringi)
txt <- "automatically got activated,may be we download,network services,food quality is excellent"

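The files should be searched for "automatically got activated", which should be replaced with "automatically_got_activated"; "may be we download" should be replaced with "may_be_we_download", and so on.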
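To make the intended mapping concrete, here is a minimal base R sketch (not part of the original attempt) that derives the underscore form of each keyword from the comma-separated string. The names `keywords` and `replacements` are illustrative assumptions.

```r
txt <- "automatically got activated,may be we download,network services,food quality is excellent"

# Split the comma-separated string into individual keywords.
keywords <- strsplit(txt, ",", fixed = TRUE)[[1]]

# Build each replacement by turning the spaces into underscores.
replacements <- gsub(" ", "_", keywords, fixed = TRUE)

# Show the keyword-to-replacement mapping side by side.
print(data.frame(keyword = keywords, replacement = replacements))
```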

txt <- "automatically got activated,may be we download,network services,food quality is excellent"

for(i in 1:length(txt)) {
    start <- head(strsplit(txt, split=" ")[[i]], 1) #finding the first word of the keyword 
    n <- stri_stats_latex(txt[i])[4]        #number of words in the keyword

    o <- tolower(regmatches(text, regexpr(paste0(start,"(?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,",
        n-1,"}"),text,ignore.case=TRUE)))   #best match for keyword for the regex in the file 

    p <- which(!is.na(pmatch(txt, o)))      #exact match for the keywords
}

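【Comments】:

This question could use some cleanup. Your title and your description ask for different things. It is also too large; there is too much information to reproduce the problem. Try trimming the data down a bit so that people can easily read it into R.
Ten questions like this and the SO database will shut down....
Please consider doing this to reduce the size of your question: stackoverflow.com/questions/5963269/…
Sorry guys, I am new here.. :(.. Thanks for sharing the links.
@OnkarK I think you need to be a bit more specific. I think you are trying to show us what you want through the code you wrote, but that code does not work the way you expect. Even with well-written code, it is hard to understand a problem from code alone. I suggest you really define what you are after (maybe even as a list of rules). Then actually show us the sample output you expect the code to return. Here is an example where I could not describe the problem in words alone, so I also gave the desired output: stackoverflow.com/questions/22235288/…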

Tags: regex string r stringi

【Answer 1】:

I think this may be what you are looking for.

> txt <- "automatically got activated,may be we download,network services,food quality is excellent"

The sentences to search, as a vector:

> searchList <- c('This is a sentence that automatically got activated',
                  'may be we download some music tonight',
                  'I work in network services',
                  'food quality is excellent every time I go',
                  'New service entrance',
                  'full quantity is excellent')

A function that does the job:

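replace.keyword <- function(text, toSearch)
{
    # Split the comma-separated keyword string (the 'text' argument) into keywords
    kw <- unlist(strsplit(text, ','))
    # Replacement for each keyword: whitespace becomes underscores
    gs <- gsub('\\s', '_', kw)
    sapply(seq_along(kw), function(i){
      # Replace the keyword in sentences that contain it; blank out the rest
      ul <- ifelse(grepl(kw[i], toSearch),
                   gsub(kw[i], gs[i], toSearch),
                   "")
      # Keep only the sentences in which keyword i was found
      ul[nzchar(ul)]
    })
}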

Result:

> replace.keyword(txt, searchList)
# [1] "This is a sentence that automatically_got_activated"
# [2] "may_be_we_download some music tonight"              
# [3] "I work in network_services"                         
# [4] "food_quality_is_excellent every time I go"

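Let me know if it works for you.

【Comments】:

Thanks a lot. This is exactly what I wanted. It is a simple find and replace text (FART) function. My bad..
Just a small doubt: the keywords were expected to be replaced in (txt) rather than in (searchList). A second doubt: sentences that do not match any keyword should be kept in (searchList).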
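The second point in the last comment can be sketched as follows. This is a minimal base R variant under stated assumptions, not the answerer's code: the helper name `replace_keywords_all` is hypothetical, and it treats each keyword as a literal string (`fixed = TRUE`) instead of a regular expression. It returns every sentence, leaving those without a keyword unchanged.

```r
# Hypothetical helper (not from the original answer): applies every keyword
# replacement to every sentence and returns all sentences, matched or not.
replace_keywords_all <- function(keywords, sentences) {
  kw <- unlist(strsplit(keywords, ",", fixed = TRUE))
  gs <- gsub(" ", "_", kw, fixed = TRUE)
  for (i in seq_along(kw)) {
    # Literal (non-regex) replacement of keyword i in all sentences
    sentences <- gsub(kw[i], gs[i], sentences, fixed = TRUE)
  }
  sentences
}

txt <- "automatically got activated,may be we download,network services,food quality is excellent"
searchList <- c("This is a sentence that automatically got activated",
                "may be we download some music tonight",
                "I work in network services",
                "food quality is excellent every time I go",
                "New service entrance",
                "full quantity is excellent")

result <- replace_keywords_all(txt, searchList)
print(result)
```

With this variant the output has the same length as `searchList`, so each result lines up with the sentence it came from.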