【Question title】: How to split strings in R? [duplicate]
【Posted】: 2018-05-28 11:00:41
【Question】:

I have a data frame like this:

  Screen.name party                                   users
1 A_Gloeckner   SPD                          @MartinSchulz.
2 A_Gloeckner   SPD                           @MartinSchulz
3 A_Gloeckner   SPD @ManuelaSchwesig @sigmargabriel @nahles
4  a_grotheer   SPD                           @SouthendRNLI
5  a_grotheer   SPD                           @ribasdiego10
6  a_grotheer   SPD                        @HBBuergerschaft
7  a_grotheer   SPD                             @UniBremen…

I want to split the third column so that each mentioned user gets its own row, like this:

  Screen.name party  mentioned_users
1 A_Gloeckner   SPD   @MartinSchulz.
2 A_Gloeckner   SPD    @MartinSchulz
3 A_Gloeckner   SPD @ManuelaSchwesig
4 A_Gloeckner   SPD   @sigmargabriel
5 A_Gloeckner   SPD          @nahles
6  a_grotheer   SPD    @SouthendRNLI
7  a_grotheer   SPD    @ribasdiego10
8  a_grotheer   SPD @HBBuergerschaft
9  a_grotheer   SPD      @UniBremen…

So far I have tried this:

mention_polits_2017=mention_polits_2017[,list(mention_polits_2017=unlist(strsplit(mention_polits_2017,","))),by=mention_polits_2017$Screen.name]

But it gives me an error: "[.data.frame(mention_polits_2017, , list(mention_polits_2017 = unlist(strsplit(mention_polits_2017, : unused argument (by = mention_polits_2017$Screen.name)"

Thanks.
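
The error is raised by `[.data.frame`, which has no `by` argument; the `[, list(...), by = ...]` form is data.table syntax and only works on a data.table. The attempt also splits on "," while the values shown look space-separated. As a minimal base R sketch of the reshaping being asked for, assuming the handles are separated by single plain spaces (the toy data frame and the names `parts`, `idx` and `out` are illustrative, not from the post):

```r
# Toy data with values copied from the question; single plain spaces assumed
df <- data.frame(
  Screen.name = c("A_Gloeckner", "A_Gloeckner", "a_grotheer"),
  party       = c("SPD", "SPD", "SPD"),
  users       = c("@MartinSchulz",
                  "@ManuelaSchwesig @sigmargabriel @nahles",
                  "@SouthendRNLI"),
  stringsAsFactors = FALSE
)

# One character vector of handles per original row
parts <- strsplit(df$users, " ", fixed = TRUE)

# Repeat each row index once per handle it contains
idx <- rep(seq_len(nrow(df)), lengths(parts))

# Rebuild the data frame with one handle per row
out <- data.frame(
  Screen.name     = df$Screen.name[idx],
  party           = df$party[idx],
  mentioned_users = unlist(parts),
  stringsAsFactors = FALSE
)
print(out)
```

Indexing with `idx` keeps the other columns aligned with the flattened handles, which is the part the `by =` grouping was presumably meant to handle.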

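【Comments】:

  • What have you tried so far?
  • mention_polits_2017=mention_polits_2017[,list(mention_polits_2017=unlist(strsplit(mention_polits_2017,","))),by=mention_polits_2017$Screen.name]
  • Could you update the question with what you have done and the expected output? That would help everyone.
  • I had already looked at the answers to that question and tried almost all of the alternatives, but there is something wrong with my data frame itself. The strings I want to split are oddly formatted; I suspect a separator problem. For example, this is what the element in the third row gives: strsplit(mention_polits_2017[3,3], " ") [[1]] [1] "" "@ManuelaSchwesig" "" "@sigmargabriel" "" "@nahles "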
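
The empty strings in the last comment are what `strsplit()` returns when the input has a leading separator or two separators in a row. A small sketch of that effect and one possible cleanup, assuming the extra characters are ordinary whitespace (the value of `x` below is made up to mimic the comment, not taken from the real data):

```r
# Hypothetical cell value: extra plain spaces added to mimic the comment's output
x <- " @ManuelaSchwesig  @sigmargabriel  @nahles "

# Splitting on a single space yields "" for the leading and the doubled spaces
raw <- strsplit(x, " ", fixed = TRUE)[[1]]
print(raw)

# Trim both ends, then split on runs of whitespace instead
clean <- strsplit(trimws(x), "[[:space:]]+")[[1]]
print(clean)
```

The trailing character left in "@nahles " in the comment hints that it may not be a plain space. If that is the case, the pattern would have to be extended to cover that character; this is a guess, since the raw data is not shown.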

Tags: r string strsplit


【Solution 1】:

You can try separate_rows() from tidyr:

library(tidyverse)
df %>% 
 separate_rows(users, sep=" ")
  Screen.name party            users
1 A_Gloeckner   SPD   @MartinSchulz.
2 A_Gloeckner   SPD    @MartinSchulz
3 A_Gloeckner   SPD @ManuelaSchwesig
4 A_Gloeckner   SPD   @sigmargabriel
5 A_Gloeckner   SPD          @nahles
6  a_grotheer   SPD    @SouthendRNLI
7  a_grotheer   SPD    @ribasdiego10
8  a_grotheer   SPD @HBBuergerschaft
9  a_grotheer   SPD       @UniBremen

Data

df <- structure(list(Screen.name = structure(c(1L, 1L, 1L, 2L, 2L, 
                                               2L, 2L), .Label = c("A_Gloeckner", "a_grotheer"), class = "factor"), 
                     party = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "SPD", class = "factor"), 
                     users = c("@MartinSchulz.", "@MartinSchulz", "@ManuelaSchwesig @sigmargabriel @nahles", 
                               "@SouthendRNLI", "@ribasdiego10", "@HBBuergerschaft", "@UniBremen"
                     )), class = "data.frame", .Names = c("Screen.name", "party", 
                                                          "users"), row.names = c(NA, -7L))

【Comments】:
