【Posted】: 2019-07-23 17:16:09
【Question】:
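I have a large data frame of text in which I want to find keywords. Each keyword also has a category assigned to it. I need help figuring out how to append two columns: one with the keywords found and one with the associated categories.
I think I have the right code to create the keyword column; however, I don't know how to create the category column.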
library(tidyverse)  # provides tibble(), as_tibble(), mutate() and %>%

# Generate sample data
text <- tibble(phrases = c("Hello my name is Bob", "I wasted time when I was that age", "What time is the party?"))
keys <- tibble(words = c("name", "age", "time"), categories = c("demographic", "demographic", "details"))

# Find keyword matches: one logical column per keyword, one row per phrase
text_match <- sapply(keys$words, grepl, text$phrases) %>%
  as_tibble() %>%
  # Join the names of the columns that are TRUE in each row
  mutate(Keywords = apply(., 1, function(x) paste(colnames(.)[x == 1], collapse = " | ")))
This correctly generates the Keywords column:
  name  age   time  Keywords
1 TRUE  FALSE FALSE name
2 FALSE TRUE  TRUE  age | time
3 FALSE FALSE TRUE  time
But how can I create the Category column? I want something like this:
  name  age   time  Keywords   Category
1 TRUE  FALSE FALSE name       demographic
2 FALSE TRUE  TRUE  age | time demographic | details
3 FALSE FALSE TRUE  time       details
【Discussion】:
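One possible approach, sketched in base R rather than the tidyverse pipeline from the question: the logical match matrix can index any vector that runs parallel to the keywords, so the same row-wise collapse that builds Keywords can build Category from `keys$categories`. The names `hits` and `collapse_hits` are made up for this sketch. Two choices here are assumptions rather than anything the question states: `fixed = TRUE` treats each keyword as a literal string, and `unique()` keeps a category from being listed twice when two matched keywords share it.

```r
# Sample data from the question, as base data frames (an assumption;
# the question uses tibbles)
text <- data.frame(
  phrases = c("Hello my name is Bob",
              "I wasted time when I was that age",
              "What time is the party?"),
  stringsAsFactors = FALSE
)
keys <- data.frame(
  words = c("name", "age", "time"),
  categories = c("demographic", "demographic", "details"),
  stringsAsFactors = FALSE
)

# Logical matrix: one row per phrase, one column per keyword.
# fixed = TRUE (an assumption) matches keywords literally, not as regexes.
hits <- sapply(keys$words, grepl, x = text$phrases, fixed = TRUE)

# Hypothetical helper: for each row of `hits`, pick the labels whose
# keyword matched, drop duplicates, and join them with " | ".
collapse_hits <- function(labels) {
  apply(hits, 1, function(row) paste(unique(labels[row]), collapse = " | "))
}

text$Keywords <- collapse_hits(keys$words)
text$Category <- collapse_hits(keys$categories)

print(text)
```

Because `keys$words` and `keys$categories` are in the same order as the columns of `hits`, no join is needed. Note that `sapply` returns a matrix only when there is more than one phrase, so a single-row input would need extra handling.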