【Question title】: Break string into individual fields in R [duplicate]
【Posted】: 2020-07-30 17:34:57
【Question description】:

I have the string

She was the youngest of the two daughters of a most affectionate

I want to turn it into a vector like the following

shewastheyoungest

If possible, I would like to use stringr.

Thanks.

【Question comments】:

    Tags: r nlp stringr


    【Solution 1】:

    Any of the following will work:

    scan(text=charv, what = character())
     [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
     [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"
    

    unlist(strsplit(charv,' '))
    
     [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
     [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"
    

    read.table(text=gsub(' ','\n',charv))
                 V1
    1           She
    2           was
    3           the
    4      youngest
    5            of
    6           the
    7           two
    8     daughters
    9            of
    10            a
    11         most
    12 affectionate
    

     unlist(regmatches(charv,gregexpr('\\w+',charv)))
     [1] "She"          "was"          "the"          "youngest"     "of"           "the"         
     [7] "two"          "daughters"    "of"           "a"            "most"         "affectionate"
    

    where:

     charv<-'She was the youngest of the two daughters of a most affectionate'
    

    EDIT: using stringr, either of the following works. Both return a list with one element per input string:

    library(stringr)
    str_extract_all(charv, '\\w+')
    str_split(charv," ")
    
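Because both stringr calls return a list rather than a bare vector, a minimal sketch of getting to the lowercase vector shown in the question might look like this. It assumes the stringr package is installed; the names `words` and `words_lower` are illustrative, not from the original answer.

```r
library(stringr)

charv <- 'She was the youngest of the two daughters of a most affectionate'

# str_split() returns a list with one element per input string,
# so take the first element to get a plain character vector.
words <- str_split(charv, " ")[[1]]

# Lowercase to match the "she", "was", ... form shown in the question
# (an assumption about the desired output).
words_lower <- str_to_lower(words)

# Individual elements are now accessible by index.
words_lower[4]   # "youngest"
```

`unlist(str_split(charv, " "))` would give the same vector; `[[1]]` just makes it explicit that only one input string is involved.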

    【Discussion】:

      【Solution 2】:

      Try this:

      charv <- 'She was the youngest of the two daughters of a most affectionate'
      
      #Code
      x <- do.call(c,strsplit(charv,split = ' '))
      
      [1] "She"          "was"          "the"          "youngest"     "of"           "the"          "two"         
      [8] "daughters"    "of"           "a"            "most"         "affectionate"
      

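      【Discussion】:

      • Yes, I did that, and then I had to call as.data.frame() to get it into a data frame. I tried str_remove(charv, " "), but it was hard to access the elements inside. Thanks
      • str_split() also works @DanielJachetta
      • @DanielJachetta As you said, a vector. Sorry for the confusion!
      • For something more concise, strsplit(charv,split = ' ')[[1]] will do!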
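The comments above touch on two points: the base R variants are interchangeable, and the result may need to end up in a data frame. A short sketch of both in base R follows; the names `a`, `b`, `d`, `df` and the column name `word` are hypothetical and chosen only for illustration.

```r
charv <- 'She was the youngest of the two daughters of a most affectionate'

# Three ways to flatten the one-element list returned by strsplit().
a <- do.call(c, strsplit(charv, split = ' '))
b <- unlist(strsplit(charv, ' '))
d <- strsplit(charv, split = ' ')[[1]]

# All three give the same character vector.
all_same <- identical(a, b) && identical(b, d)
all_same   # TRUE

# One word per row; `word` is a hypothetical column name.
df <- data.frame(word = d, stringsAsFactors = FALSE)
nrow(df)   # 12
```

Naming the column when the data frame is built avoids the auto-generated column name that as.data.frame() would otherwise assign.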