【Question title】: How to get duplicate rows from table in R [duplicate]
【Posted】: 2016-03-03 22:45:50
【Question】:
Name Address Account    a   b      Amount   Phone
John CA     4879759  qwqe   rerter  203     807789747
Nil  FD     1234455  iuyui  jhgjhg  4321    98797897
Was  FR     8979696  yikjh  kkjhk   45989   9899999
Nil  FD     1234455  iuyui  jhgjhg  4321    98797897
John CA     4879759  qwqe   rerter  203     807789747
Saw  PO     9873279  kjljl  bjhjh   765     3543656
Nil  FD     1234455  iuyui  jhgjhg  4321    98797897
Aws  IL     707009   dfdsf  sasd    2344    242545
John CA     4879759  qwqe   rerter  203     807789747

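I want to extract the duplicated rows from this table using R. The table is named "loan" and has 17 billion line items. The key columns are "Name, Address, Account, Amount, Phone". I would appreciate any working solution.

One more thing: after separating out the duplicates, I want to save that table of duplicated rows in .csv format. I am new to R, so please help.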

【Comments】:

Tags: r


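【Solution 1】:

We can use duplicated to get all the duplicated rows based on the key columns ('nm1'). Combining a forward pass with a backward pass (fromLast=TRUE) keeps every row of each duplicate group, including its first occurrence.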

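# Key columns that define a duplicate
nm1 <- c("Name", "Address", "Account", "Amount", "Phone")
# df1 is the data frame holding the table; keep a row if its key matches an earlier or a later row
df1[duplicated(df1[nm1]) | duplicated(df1[nm1], fromLast = TRUE), ]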
# Name Address Account     a      b Amount     Phone
#1 John      CA 4879759  qwqe rerter    203 807789747
#2  Nil      FD 1234455 iuyui jhgjhg   4321  98797897
#4  Nil      FD 1234455 iuyui jhgjhg   4321  98797897
#5 John      CA 4879759  qwqe rerter    203 807789747
#7  Nil      FD 1234455 iuyui jhgjhg   4321  98797897
#9 John      CA 4879759  qwqe rerter    203 807789747
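
The question also asks how to save the duplicated rows as a .csv file, which the answer above does not cover. The following is a minimal sketch, assuming the table is small enough to sit in memory as a data frame called loan (rebuilt here from the sample in the question). The names key_cols, is_dup, all_dups and out_file, and the file name loan_duplicates.csv, are placeholders and not part of the original answer.

```r
# Rebuild the sample table from the question (column names as posted).
loan <- data.frame(
  Name    = c("John", "Nil", "Was", "Nil", "John", "Saw", "Nil", "Aws", "John"),
  Address = c("CA", "FD", "FR", "FD", "CA", "PO", "FD", "IL", "CA"),
  Account = c(4879759, 1234455, 8979696, 1234455, 4879759,
              9873279, 1234455, 707009, 4879759),
  a       = c("qwqe", "iuyui", "yikjh", "iuyui", "qwqe",
              "kjljl", "iuyui", "dfdsf", "qwqe"),
  b       = c("rerter", "jhgjhg", "kkjhk", "jhgjhg", "rerter",
              "bjhjh", "jhgjhg", "sasd", "rerter"),
  Amount  = c(203, 4321, 45989, 4321, 203, 765, 4321, 2344, 203),
  Phone   = c(807789747, 98797897, 9899999, 98797897, 807789747,
              3543656, 98797897, 242545, 807789747),
  stringsAsFactors = FALSE
)

key_cols <- c("Name", "Address", "Account", "Amount", "Phone")

# Flag a row if its key matches an earlier row OR a later row,
# so every member of a duplicate group is kept, including the first.
is_dup <- duplicated(loan[key_cols]) |
  duplicated(loan[key_cols], fromLast = TRUE)
all_dups <- loan[is_dup, ]

# Write the duplicated rows to CSV without the row-name column.
# The path is a placeholder; point it wherever the file should go.
out_file <- file.path(tempdir(), "loan_duplicates.csv")
write.csv(all_dups, out_file, row.names = FALSE)
```

Passing row.names = FALSE keeps the original row indices out of the file. For a table with billions of rows this in-memory approach is unlikely to be practical as written, so treat it as an illustration of the logic only.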

【Discussion】:

  • Thank you very much, akrun.
【Solution 2】:

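An extension of akrun's answer that includes only the key columns in the duplicate check. Because duplicated is called once, without fromLast, this returns only the second and later occurrences of each key, not the first: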

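# Key columns that define a duplicate
mainCols <- c("Name", "Address", "Account", "Amount", "Phone")
# TRUE for every row whose key already appeared in an earlier row
duplicatedRows <- duplicated(loan[, mainCols])
# Subset the table to those repeated rows
duplicatedData <- loan[duplicatedRows, ]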

# Name Address Account     a      b Amount     Phone
# 4  Nil      FD 1234455 iuyui jhgjhg   4321  98797897
# 5 John      CA 4879759  qwqe rerter    203 807789747
# 7  Nil      FD 1234455 iuyui jhgjhg   4321  98797897
# 9 John      CA 4879759  qwqe rerter    203 807789747
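
As a follow-up to the two answers, it can help to see how many times each duplicated key occurs. The sketch below is an illustration only: it uses a reduced, hypothetical version of the loan table with just two of the key columns (Name and Account), and the names later_only and dup_counts are made up for this example.

```r
# Reduced, hypothetical version of the table: two key columns only.
loan <- data.frame(
  Name    = c("John", "Nil", "Was", "Nil", "John", "Nil"),
  Account = c(4879759, 1234455, 8979696, 1234455, 4879759, 1234455),
  stringsAsFactors = FALSE
)
key_cols <- c("Name", "Account")

# Second and later occurrences only (the behaviour of Solution 2).
later_only <- loan[duplicated(loan[key_cols]), ]

# Count rows per key combination, then keep keys seen more than once.
counts <- aggregate(n ~ ., data = transform(loan[key_cols], n = 1L), FUN = sum)
dup_counts <- counts[counts$n > 1, ]

print(later_only)
print(dup_counts)
```

Here later_only holds three rows (two for Nil, one for John), while dup_counts holds one row per duplicated key with its total count, which is a compact way to check the result before writing anything to disk.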

【Discussion】:
