Break string into individual fields in R [duplicate]: answers

【Question title】: Break string into individual fields in R [duplicate]
【Posted】: 2020-07-30 17:34:57
【Question description】:

I have the string

She was the youngest of the two daughters of a most affectionate

I want to turn it into a vector like the following:

she, was, the, youngest, etc.

If possible, I would like to use stringr.

Thanks.

【Comments】:

Tags: r nlp stringr

【Solution 1】:

Any of the following will work:

scan(text=charv, what = character())
 [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
 [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"

or

unlist(strsplit(charv,' '))

 [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
 [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"

or

read.table(text=gsub(' ','\n',charv))
             V1
1           She
2           was
3           the
4      youngest
5            of
6           the
7           two
8     daughters
9            of
10            a
11         most
12 affectionate

or

unlist(regmatches(charv,gregexpr('\\w+',charv)))
 [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
 [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"

where:

charv <- 'She was the youngest of the two daughters of a most affectionate'
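The approaches above agree on this particular sentence, but they can differ on messier input. The following is a minimal sketch, using a made-up string `charv2` (not from the question) that contains a double space and a comma, to show how splitting on a single space, splitting on runs of whitespace, and matching `\\w+` behave:

```r
# Hypothetical input with a double space and punctuation (not from the question).
charv2 <- "She  was, the youngest"

# Splitting on a single space leaves an empty string between the two spaces.
by_space <- strsplit(charv2, " ")[[1]]
by_space
# "She" "" "was," "the" "youngest"

# Splitting on one or more whitespace characters avoids the empty string,
# but punctuation stays attached to the word.
by_ws <- strsplit(charv2, "\\s+")[[1]]
by_ws
# "She" "was," "the" "youngest"

# Matching runs of word characters drops both the extra space and the comma.
by_word <- unlist(regmatches(charv2, gregexpr("\\w+", charv2)))
by_word
# "She" "was" "the" "youngest"
```

Which behaviour is preferable depends on whether punctuation should be kept as part of each field.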

EDIT: using stringr, either of the following works:

library(stringr)
str_extract_all(charv, '\\w+')
str_split(charv," ")
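Note that `str_extract_all()` and `str_split()` each return a list with one element per input string, not a bare character vector. A short sketch of one way to get at the words, assuming stringr is installed (the variable names `words_list`, `words`, and `words2` are illustrative only):

```r
library(stringr)

charv <- "She was the youngest of the two daughters of a most affectionate"

# str_split() returns a list; here it has a single element because
# charv is a single string.
words_list <- str_split(charv, " ")
class(words_list)
# "list"

# Take the first list element to get a plain character vector.
words <- words_list[[1]]
words[4]
# "youngest"

# unlist() flattens the list returned by str_extract_all() in the same way.
words2 <- unlist(str_extract_all(charv, "\\w+"))
identical(words, words2)
# TRUE
```

Indexing with `[[1]]` makes it explicit that only the first input string is wanted, while `unlist()` would merge the results if `charv` held several strings.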

【Discussion】:

【Solution 2】:

Try this:

charv <- 'She was the youngest of the two daughters of a most affectionate'

#Code
x <- do.call(c,strsplit(charv,split = ' '))

[1] "She"          "was"          "the"          "youngest"     "of"           "the"          "two"         
[8] "daughters"    "of"           "a"            "most"         "affectionate"

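【Discussion】:

Yes, I did that, and then I had to call as.data.frame() to get it into a data frame. I tried str_remove(charv, " "), but it was hard to access the elements inside. Thanks
str_split() also works @DanielJachetta
@DanielJachetta As you said, a vector. Sorry for the confusion!
To be more concise, strsplit(charv,split = ' ')[[1]] works!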
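For the data frame mentioned in the comments, one possible sketch in base R builds it directly from the split vector. The column name `word` and the object name `words_df` are assumptions chosen for illustration:

```r
charv <- "She was the youngest of the two daughters of a most affectionate"

# [[1]] pulls the character vector out of the list returned by strsplit().
words <- strsplit(charv, split = " ")[[1]]

# One row per word; stringsAsFactors = FALSE keeps the column as character
# on R versions older than 4.0 as well.
words_df <- data.frame(word = words, stringsAsFactors = FALSE)

nrow(words_df)
# 12
words_df$word[4]
# "youngest"
```

Individual elements can then be reached either through the vector (`words[4]`) or through the data frame column (`words_df$word[4]`).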