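[Posted]: 2021-12-13 01:08:16
[Question]:
I have a data frame with text (regions) in the field "residual". I want to search that field and, whenever a word from a set of keywords is found, duplicate the row and put the code associated with that keyword into the field "country".
For example, if the word "Alabama" is found in residual, the row is duplicated and "USA", the value in "codes" associated with Alabama in the key set, is written to the country field of the copy.
strsplit splits along columns rather than rows, and separate_rows needs a common delimiter, so I was quickly out of my depth...
Dummy data: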
keys <- data.frame(key=c("Canada", "Alabama", "Maryland"), codes=c("CAN", "USA", "USA"))
df <- data.frame(residual=c("Canada, Alabama, Maryland line","Austria","Denver and Boulder","Alabama"),
country=c("America","Austria","America","America"),
otherfields=c("foo","foo","foo","foo"))
Desired output:
result <- data.frame(residual=c("Canada, Alabama, Maryland line",
"Canada, Alabama, Maryland line",
"Canada, Alabama, Maryland line",
"Canada, Alabama, Maryland line",
"Austria",
"Denver and Boulder",
"Alabama",
"Alabama"
),
country=c("America",
"CAN",
"USA",
"USA",
"Austria",
"America",
"America",
"USA"),
otherfields=c("foo","foo","foo","foo","foo","foo","foo","foo"))
[Comments]:
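One possible approach, as a minimal base-R sketch rather than a definitive answer: for each row, test every key against residual, keep the original row, and append one copy per matching key with country set to that key's code. The helper name expand_row is made up for this illustration. It assumes plain substring matching (so "Alabama" would also match inside a longer word) and character columns rather than factors, hence the explicit stringsAsFactors = FALSE.

```r
keys <- data.frame(key = c("Canada", "Alabama", "Maryland"),
                   codes = c("CAN", "USA", "USA"),
                   stringsAsFactors = FALSE)
df <- data.frame(residual = c("Canada, Alabama, Maryland line", "Austria",
                              "Denver and Boulder", "Alabama"),
                 country = c("America", "Austria", "America", "America"),
                 otherfields = c("foo", "foo", "foo", "foo"),
                 stringsAsFactors = FALSE)

# expand_row is a hypothetical helper: returns row i followed by one copy
# of it for every key found in its residual text.
expand_row <- function(i) {
  # fixed = TRUE means literal substring matching, not a regular expression
  hit <- vapply(keys$key, grepl, logical(1), x = df$residual[i], fixed = TRUE)
  out <- df[rep(i, 1 + sum(hit)), ]
  # first row keeps the original country, the copies get the matching codes
  out$country <- c(df$country[i], keys$codes[hit])
  out
}

expanded <- do.call(rbind, lapply(seq_len(nrow(df)), expand_row))
rownames(expanded) <- NULL
expanded
```

Copies come out in the order the keys appear in keys, not the order they appear in the text. If whole-word matching is needed, the grepl call could be swapped for a regex with word boundaries instead of fixed = TRUE.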
Tags: r split duplicates rows