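[Posted]: 2020-02-21 22:39:49
[Question]:
This is a follow-up to this question: Concatenate previous and latter words to a word that match a condition in R
I am looking for a regex that splits a string at the second space after each comma. See the example below: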
vector <- c("Paulsen", "Kehr,", "Diego",
"Schalper", "Sepúlveda,", "Alejandro",
"Von Housen", "Kush,", "Terry")
X <- paste(vector, collapse = " ")
X
## this is the string I am looking to split:
"Paulsen Kehr, Diego Schalper Sepúlveda, Alejandro Von Housen Kush, Terry"
The second space after each comma is the split criterion for my regex. So my output would be:
"Paulsen Kehr, Diego"
"Schalper Sepúlveda, Alejandro"
"Von Housen Kush, Terry"
I came up with a pattern, but it does not quite work.
[^ ]+ [^ ]+, [^ ]+( )
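Using it with strsplit removes all the matched words instead of splitting only at group 1 (i.e. [^ ]+ [^ ]+, [^ ]+(group-1)). I think I just need to exclude the full match and match only the space. --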
regex demo
strsplit(X, "[^ ]+ [^ ]+, [^ ]+( )")
# [1] "" [2] "" [3] "Von Housen Kush, Terry"
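A quick way to see why the words vanish, as a minimal sketch that assumes the string produced by paste() above: strsplit treats the entire match as the delimiter and discards it, capture groups included, and regmatches shows how much text each match covers. The names pattern and full_matches are only illustrative.

```r
# Same string as X above, rebuilt here so the snippet runs on its own.
X <- "Paulsen Kehr, Diego Schalper Sepúlveda, Alejandro Von Housen Kush, Terry"
pattern <- "[^ ]+ [^ ]+, [^ ]+( )"

# Extract every full match of the pattern; strsplit() drops exactly this text.
full_matches <- regmatches(X, gregexpr(pattern, X))[[1]]
print(full_matches)

# Each match is far longer than one character, so whole names are consumed.
print(nchar(full_matches))
```

The last name has no trailing space, so it is never matched and is the only piece that survives the split.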
Can anyone think of a regex that finds the second space after each comma?
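One possible direction, sketched here under the assumption that a PCRE pattern is acceptable: with perl = TRUE, \K resets the start of the reported match, so the text before it must be present but is not treated as part of the delimiter. The pattern below is only a guess that assumes exactly one space and one word after each comma; parts is an illustrative name.

```r
X <- "Paulsen Kehr, Diego Schalper Sepúlveda, Alejandro Von Housen Kush, Terry"

# ",\\s\\S+" requires a comma, one whitespace character and one word;
# "\\K" then drops that text from the reported match, so only the final
# "\\s" (the second space after the comma) acts as the split point.
parts <- strsplit(X, ",\\s\\S+\\K\\s", perl = TRUE)[[1]]
print(parts)
```

If this behaves as intended, parts should hold the three full names, with "Von Housen Kush, Terry" kept intact because no split happens at spaces that are not preceded by a comma and a word.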
[Comments]:
Tags: regex r strsplit