【Question title】: Matching two data frames with some characters in R
【Posted】: 2020-02-25 20:41:49
【Question description】:

I have the following data frame:

df1 <- data.frame(
    Description=c("How are you- doing?",	"will do it tomorrow otherwise: next week",	"I will work hard to complete it for nextr week1 or  tomorrow",	"I am HAPPY with this situation now","Utilising this approach can helpα'x-ray",	"We need to use interseting <U+0452> books to solve the issue",	"Not sure if we could do it appropriately.",	"The schools and Universities are closed in f -blook for a week", 	"Things are hectic here and we are busy"))

   


I want to get the following table:

d <- data.frame(
    Description=c("Utilising this approach can helpa'x-ray",	"How are you- doing",	" We need to use interseting <U+0452> books to solve the issue ",	" will do it tomorrow otherwise: next week ",	" Things are hectic here and we are busy ",	"I will work hard to complete it for nextr week1 or  tomorrow ",	"The schools and Universities are closed in f -blook for a week", 	" I am HAPPY with this situation now "," I will work hard to complete it for nextr week1 or  tomorrow"))
f2 <- read.table(text="B12	B6	B9
No	Yes	Yes
12	6	9
No	No	Yes
No	No	Yes
No	No	Yes
Yes	No	Yes
11	No	Yes
12	11	P
No	No	Yes

", header=TRUE)

df3<-cbind(d,f2)

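As you can see in the Description column, the strings differ by extra spaces, colons, and so on. The 1 after "week" is a subscript, and I could not fix it. I want to match the rows based on Description, that is, match df1 with df2 (built as df3 in the code above) using Description. Can this be done in R?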

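【Comments】:

  • FYI, you misspelled read.table
  • I can't run your example code (even after fixing the typo)
  • So you want to swap the first column of df2 with the first column of df1? Something like df2[, 1]
  • You can use dput to get your data as R code.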

Tags: r


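【Solution 1】:

We can use the stringdist joins from the fuzzyjoin package to match the data on 'Description'. These joins pair rows whose strings are within a small edit distance of each other, so stray spaces or a missing "?" do not prevent a match. We then use na.omit to drop the rows that contain NA (for example, rows of df1 that found no match) from the final data frame.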

na.omit(fuzzyjoin::stringdist_left_join(df1, df3, by = 'Description'))
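To make the idea behind the fuzzy join concrete, here is a minimal base-R sketch of distance-based matching. It is not what fuzzyjoin does internally (fuzzyjoin relies on the stringdist package); base `utils::adist` is swapped in so the example needs no extra packages. The data frames `left` and `right`, the helper `normalise`, and the threshold `max_dist <- 2` are all hypothetical, chosen for illustration only.

```r
# Hypothetical toy versions of the two data frames from the question.
left <- data.frame(
  Description = c("How are you- doing?",
                  "I am HAPPY with this situation now",
                  "Not sure if we could do it appropriately."),
  stringsAsFactors = FALSE
)
right <- data.frame(
  Description = c(" I am HAPPY with this situation now ",
                  "How are you- doing"),
  B12 = c("11", "No"),
  stringsAsFactors = FALSE
)

# Trim and collapse whitespace so stray spaces do not count as edits.
normalise <- function(x) gsub("\\s+", " ", trimws(x))

# Pairwise generalized Levenshtein distances (rows: left, columns: right).
d <- utils::adist(normalise(left$Description), normalise(right$Description))

# For each left row, pick the closest right row, but only accept it
# when the distance is at most max_dist (an assumed threshold).
max_dist <- 2
best <- apply(d, 1, which.min)
dist_best <- d[cbind(seq_len(nrow(d)), best)]
best[dist_best > max_dist] <- NA

# Attach the matched column and drop left rows that found no match.
matched <- cbind(left, right[best, "B12", drop = FALSE], dist = dist_best)
rownames(matched) <- NULL
matched <- matched[!is.na(best), ]
print(matched)
```

Normalising whitespace first keeps the threshold small, which reduces the risk of pairing two genuinely different sentences. With the real data, it is worth inspecting the distances before settling on a threshold.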

【Comments】:
