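【Posted】: 2015-09-18 19:29:21
【Question】:
I have a problem similar to this question: Remove duplicate rows in R
In my case I want to keep all columns (unlike the suggestion there to use the unique function on the first 3 columns). I want to consider only 2 columns of the data frame, and keep only 1 row whenever the values in those two columns are the same.
The data looks like this: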
structure(list(P1 = structure(c(1L, 1L, 3L, 3L, 5L, 5L, 5L, 5L,
4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 2L, 2L), .Label = c("Apple",
"Grape", "Orange", "Peach", "Tomato"), class = "factor"), P2 = structure(c(4L,
4L, 3L, 3L, 5L, 5L, 5L, 5L, 6L, 6L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 6L, 6L), .Label = c("Banana", "Cucumber", "Lemon", "Orange",
"Potato", "Tomato"), class = "factor"), P1_location_subacon = structure(c(NA,
NA, 1L, 1L, 1L, 1L, 1L, 1L, NA, NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("Fridge", "Table"), class = "factor"),
P1_location_all_predictors = structure(c(2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L), .Label = c("Table,Desk,Bag,Fridge,Bed,Shelf,Chair",
"Table,Shelf,Cupboard,Bed,Fridge", "Table,Shelf,Fridge"), class = "factor"),
P2_location_subacon = structure(c(1L, 1L, 1L, 1L, NA, NA,
NA, NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Fridge",
"Shelf"), class = "factor"), P2_location_all_predictors = structure(c(3L,
3L, 2L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L), .Label = c("Shelf,Fridge", "Shelf,Fridge,Bed",
"Table,Shelf,Fridge"), class = "factor")), .Names = c("P1",
"P2", "P1_location_subacon", "P1_location_all_predictors", "P2_location_subacon",
"P2_location_all_predictors"), row.names = c(NA, -20L), class = "data.frame")
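The columns that matter to me are P1 and P2. Among rows that share the same fruit/vegetable, I want to keep only one. (Keep in mind that the fruit/vegetable must match in both columns):
Example:
Before: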
      P1     P2 P1_location_subacon      P1_location_all_predictors P2_location_subacon P2_location_all_predictors
1  Apple Orange                <NA> Table,Shelf,Cupboard,Bed,Fridge              Fridge         Table,Shelf,Fridge
2  Apple Orange                <NA> Table,Shelf,Cupboard,Bed,Fridge              Fridge         Table,Shelf,Fridge
3 Orange  Lemon              Fridge              Table,Shelf,Fridge              Fridge           Shelf,Fridge,Bed
4 Orange  Lemon              Fridge              Table,Shelf,Fridge              Fridge           Shelf,Fridge,Bed
5 Tomato Potato              Fridge              Table,Shelf,Fridge                <NA>               Shelf,Fridge
6 Tomato Potato              Fridge              Table,Shelf,Fridge                <NA>               Shelf,Fridge
7 Tomato Potato              Fridge              Table,Shelf,Fridge                <NA>               Shelf,Fridge
8 Tomato Potato              Fridge              Table,Shelf,Fridge                <NA>               Shelf,Fridge
After:
      P1     P2 P1_location_subacon      P1_location_all_predictors P2_location_subacon P2_location_all_predictors
1  Apple Orange                <NA> Table,Shelf,Cupboard,Bed,Fridge              Fridge         Table,Shelf,Fridge
4 Orange  Lemon              Fridge              Table,Shelf,Fridge              Fridge           Shelf,Fridge,Bed
5 Tomato Potato              Fridge              Table,Shelf,Fridge                <NA>               Shelf,Fridge
It doesn't matter which row is kept. That can be chosen at random.
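For illustration, here is a minimal sketch of one way the behaviour described above could be obtained in base R. It assumes the data frame is stored in a variable called `df` (a hypothetical name, and a shortened stand-in for the data above with only three columns). It uses `duplicated()` on just the P1 and P2 columns, so the first row of each P1/P2 pair survives rather than a random one, which the question says is acceptable.

```r
# Shortened stand-in for the question's data (hypothetical name `df`;
# only a few rows and three of the six columns are reproduced).
df <- data.frame(
  P1 = c("Apple", "Apple", "Orange", "Orange", "Tomato", "Tomato"),
  P2 = c("Orange", "Orange", "Lemon", "Lemon", "Potato", "Potato"),
  P1_location_subacon = c(NA, NA, "Fridge", "Fridge", "Fridge", "Fridge"),
  stringsAsFactors = TRUE
)

# duplicated() flags each row whose (P1, P2) pair has already appeared.
# Negating it keeps the first row of every pair, with all columns intact.
dedup <- df[!duplicated(df[, c("P1", "P2")]), ]
print(dedup)
```

Passing `fromLast = TRUE` to `duplicated()` would keep the last row of each pair instead of the first.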
【Comments】:
Tags: r