【Question title】: How to split a cell on N number of words?
【Posted】: 2018-03-22 10:12:15
【Question description】:

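I have a data frame in which one column contains long text. I would like to split that text every 30 words, creating as many new rows as necessary, with exactly the same content in the other columns. A character-based solution won't work because I need the split to happen on words, which is why I am posting this as a separate question.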

df1<-data_frame(V1=c(1, 2, 3), V2=c('Red', 'Blue', 'Red'), text=c('Folly words widow one downs few age every seven. If miss part by fact he park just shew. Discovered had get considered projection who favourable. Necessary up knowledge it tolerably. Unwilling departure education is be dashwoods or an. Use off agreeable law unwilling sir deficient curiosity instantly. Easy mind life fact with see has bore ten. Parish any chatty can elinor direct for former. Up as meant widow equal an share least', 'Bringing unlocked me an striking ye perceive. Mr by wound hours oh happy. Me in resolution pianoforte continuing we. Most my no spot felt by no. He he in forfeited furniture sweetness he arranging. Me tedious so to behaved written account ferrars moments. Too objection for elsewhere her preferred allowance her. Marianne shutters mr steepest to me. Up mr ignorant produced distance although is sociable blessing. Ham whom call all lain like.', 'Did shy say mention enabled through elderly improve. As at so believe account evening behaved hearted is. House is tiled we aware. It ye greatest removing concerns an overcame appetite. Manner result square father boy behind its his. Their above spoke match ye mr right oh as first. Be my depending to believing perfectly concealed household. Point could to built no hours smile sense.Breakfast agreeable incommode departure it an. By ignorant at on wondered relation. Enough at tastes really so cousin am of. Extensive therefore supported by extremity of contented. Is pursuit compact demesne invited elderly be. View him she roof tell her case has sigh. Moreover is possible he admitted sociable concerns. By in cold no less been sent hard hill.' ))

I tried the following:

df <- df1%>%
      mutate(text = strsplit(as.character(text), "\\W+{30}")) %>%
      unnest(text)

But it doesn't work.

【Comments】:

Tags: r dplyr strsplit


【Solution 1】:

Here is an option that uses separate_rows together with paste:

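library(dplyr)
library(tidyr)

df1 %>%
   separate_rows(text) %>%                           # one row per word
   group_by(V1, V2) %>%
   # grp is computed in its own mutate() so that row_number() restarts for each
   # original row; recent dplyr versions document that expressions written
   # inside group_by() are evaluated on the ungrouped data
   mutate(grp = (row_number() - 1) %/% 30 + 1) %>%
   group_by(V1, V2, grp) %>%
   summarise(text = paste(text, collapse = ' ')) %>% # glue each chunk back together
   ungroup() %>%
   select(-grp)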
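
The grouping step relies on a small piece of arithmetic: integer-dividing a zero-based word position by the chunk size gives the same index to every run of N consecutive words. A minimal base R sketch of that idea follows, assuming words are separated by whitespace; `split_every_n_words` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: split one string into chunks of at most n words.
# Assumes words are separated by whitespace.
split_every_n_words <- function(x, n = 30) {
  words <- strsplit(trimws(x), "\\s+")[[1]]
  # zero-based position %/% n gives 0,0,...,1,1,...; add 1 for a 1-based chunk id
  grp <- (seq_along(words) - 1) %/% n + 1
  # paste the words of each chunk back into a single string
  unname(vapply(split(words, grp), paste, character(1), collapse = " "))
}

split_every_n_words("a b c d e f g", n = 3)
# "a b c" "d e f" "g"
```

The last chunk simply holds whatever words are left over, so nothing is dropped when the word count is not a multiple of n.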

【讨论】:

【Solution 2】:

Try this; it worked for me.

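library(stringr)

# {1,30} rather than {30}, so that a final chunk shorter than 30 words is also matched
str_match_all(df1$text, "(?:\\w+\\W*){1,30}")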
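
To get one row per chunk, as the question asks, a pattern of this kind can be dropped into a mutate() and unnest() pipeline. The following is only a sketch under stated assumptions: `toy` is a small stand-in for df1, the chunk size is 3 instead of 30 to keep the result readable, and the `{1,n}` quantifier is chosen so that a shorter final chunk is kept.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Hypothetical stand-in for df1, small enough to inspect by eye
toy <- tibble(
  V1   = c(1, 2),
  V2   = c("Red", "Blue"),
  text = c("one two three four five six seven", "alpha beta")
)

n_words <- 3  # assumed chunk size; the question uses 30
pattern <- paste0("(?:\\w+\\W*){1,", n_words, "}")

result <- toy %>%
  mutate(text = str_extract_all(text, pattern)) %>%  # list-column of chunks
  unnest(text) %>%                                   # one row per chunk
  mutate(text = str_trim(text))                      # drop the trailing separator

result
```

str_extract_all() is used here instead of str_match_all() because it returns plain character vectors, which unnest() expands directly; the other columns are repeated for every chunk.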
    

【Discussion】:
