【Posted】: 2016-03-03 15:22:21
【Question】:
I used police_officer <- str_extract_all(txtparts, "ID:.*\n") to extract the names of all police officers involved in a 911 call from a text file.
For example:
2237 DISTURBANCE Report taken
Call Taker: Telephone Operators Sharon L Moran
Location/Address: [BRO 6949] 61 WILSON ST
ID: Patrolman Darvin Anderson
Disp-22:43:39 Arvd-22:48:57 Clrd-23:49:45
ID: Patrolman Stephen T Pina
Disp-22:43:48 Clrd-22:46:10
ID: Sergeant Michael V Damiano
Disp-22:46:33 Arvd-22:47:14 Clrd-22:55:22
When a section matches more than one ID: line, I get: "c(\" Patrolman Darvin Anderson\\n\", \" Patrolman Stephen T Pina\\n\", \" Sergeant Michael V Damiano\\n\")".
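A likely explanation for the c(...) wrapper: str_extract_all returns a list with one character vector per input string, and when a list is coerced to a character vector, any element holding several values is deparsed into its R source text. The sketch below reproduces that with base R's as.character; the variable name matches is hypothetical, and it is an assumption that the same coercion happens inside the later str_replace_all calls.

```r
# A list shaped like the result of str_extract_all(): one character
# vector per input string, here holding two matches.
matches <- list(c(" Patrolman Darvin Anderson\n", " Patrolman Stephen T Pina\n"))

# Coercing the list to character deparses the multi-element vector, so
# the result is the literal text c("...", "...") with "\n" written out
# as a backslash followed by the letter n.
flattened <- as.character(matches)

# writeLines() shows the raw characters without print()'s extra escaping.
writeLines(flattened)
```

After this coercion the string no longer contains a newline character, only a backslash followed by n, which would explain why a pattern aimed at a newline does not match.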
Here is what I have tried so far to clean up the data:
police_officer <- str_replace_all(police_officer,"c\\(.","")
police_officer <- str_replace_all(police_officer,"\\)","")
police_officer <- str_replace_all(police_officer,"ID:","")
police_officer <- str_replace_all(police_officer,"\\n\","") # I can't get rid of\\n\.
This is what I end up with: " Patrolman Darvin Anderson\\n\", \" Patrolman Stephen T Pina\\n\", \" Sergeant Michael V Damiano\\n\""
I need help getting rid of the \\n\.
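One possible way to avoid the cleanup altogether is to keep the newline and the "ID: " prefix out of the match in the first place, and to collapse each list element explicitly rather than letting it be coerced. The following is a minimal sketch, not a definitive fix: the sample txtparts is abbreviated from the excerpt above, the names officer_names and officer_str are hypothetical, and it assumes exactly one space follows "ID:".

```r
library(stringr)

# Abbreviated stand-in for one element of txtparts (one call record).
txtparts <- paste0(
  "2237 DISTURBANCE Report taken\n",
  "ID: Patrolman Darvin Anderson\n",
  "Disp-22:43:39 Arvd-22:48:57 Clrd-23:49:45\n",
  "ID: Patrolman Stephen T Pina\n",
  "Disp-22:43:48 Clrd-22:46:10\n",
  "ID: Sergeant Michael V Damiano\n",
  "Disp-22:46:33 Arvd-22:47:14 Clrd-22:55:22\n"
)

# The lookbehind (?<=ID: ) requires the prefix without including it in
# the match, and "." does not match a newline by default, so ".*" stops
# at the end of the line.
officer_names <- str_extract_all(txtparts, "(?<=ID: ).*")

# The result is still a list (one character vector per record), so
# collapse each element into a single string explicitly.
officer_str <- vapply(officer_names, paste, character(1), collapse = ", ")

print(officer_str)
```

Keeping officer_names as a list preserves one vector of officers per call; the vapply step is only needed if a single string per call is wanted.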
【Comments】:
Tags: regex r string substring stringr