R for loop when all the numbers match: answers

【Question title】: R for loop when all the numbers match
【Posted】: 2018-07-15 14:36:15
【Question】:

I have a data frame with 7 numbers in each row, and I want a for or while loop that tells me when one row is identical to another row.

Data frame:

   1st 2nd 3rd 4th 5th 6th 7th
1    5  32  34  38  39  49   8
2   10  20  21  33  40  44  34
3   10  20  26  28  35  48  13
4   14  19  23  36  44  46   7
5    9  24  25  27  36  38  41
6    7  13  14  20  29  32  28
7   11  22  24  28  29  38  20
8    1  11  29  33  36  44  37
9    9  12  25  31  43  44   5
10   1   5   6  31  39  46  44
11  14  19  23  36  44  46   7

Desired output:

4   14  19  23  36  44  46   7
11  14  19  23  36  44  46   7

I tried the following code, but it does not work: lapply(df, function(i) all(df[i,] == df[1:nrow(df),]))

It does not give the right result. Please advise, thanks.

【Comments on the question】:

Do you need lapply(seq_len(nrow(df)), function(i) lapply(seq_len(nrow(df)), function(j) all(df[i,] == df[j,]))), or would you rather use outer(seq_len(nrow(df)), seq_len(nrow(df)), FUN = Vectorize(function(i, j) all(df[i,] == df[j,])))?
Possible duplicate: stackoverflow.com/questions/12495345/…
Try lapply(seq_len(nrow(df)), function(i) {i1 <- rowSums(df[i,][col(df)] == df)== ncol(df); if(sum(i1) >1) df[i1,]})

Tags: r for-loop dataframe lapply

【Solution 1】:

A base R option would be

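# For each row i, flag every row identical to it and keep the group
# only if it has more than one member; unique() drops repeated groups
unique(Filter(Negate(is.null), lapply(seq_len(nrow(df)), function(i) {
  i1 <- rowSums(df[i, ][col(df)] == df) == ncol(df)
  if (sum(i1) > 1) df[i1, ]
})))
#[[1]]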
#    1st  2nd  3rd  4th  5th  6th  7th
#4    14   19   23   36   44   46    7
#11   14   19   23   36   44   46    7
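
The question literally asks for a for loop, so the same pairwise comparison can also be spelled out explicitly. The following is only a minimal sketch, not part of the original answer: the names `matches` and `result` are made up for illustration, and the data are read with `check.names = FALSE` on the assumption that the original `1st`, `2nd`, ... headers should be kept.

```r
# Example data from the question; check.names = FALSE (an assumption)
# keeps the headers as "1st", "2nd", ... instead of "X1st", "X2nd", ...
df <- read.table(text =
"1st 2nd 3rd 4th 5th 6th 7th
1    5  32  34  38  39  49   8
2   10  20  21  33  40  44  34
3   10  20  26  28  35  48  13
4   14  19  23  36  44  46   7
5    9  24  25  27  36  38  41
6    7  13  14  20  29  32  28
7   11  22  24  28  29  38  20
8    1  11  29  33  36  44  37
9    9  12  25  31  43  44   5
10   1   5   6  31  39  46  44
11  14  19  23  36  44  46   7",
header = TRUE, check.names = FALSE)

matches <- integer(0)   # hypothetical name: indices of rows that have a twin
n <- nrow(df)

for (i in seq_len(n - 1)) {
  for (j in seq(i + 1, n)) {
    # unlist() turns each one-row data frame into a plain vector,
    # so == compares the two rows value by value
    if (all(unlist(df[i, ]) == unlist(df[j, ]))) {
      matches <- union(matches, c(i, j))
    }
  }
}

result <- df[sort(matches), ]
print(result)
```

This compares every pair of rows once, so the work grows with the square of the number of rows; for large data frames a vectorised approach is likely the better choice.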

If we are only interested in the duplicated rows

df[duplicated(df) | duplicated(df, fromLast = TRUE), ]
#    1st  2nd  3rd  4th  5th  6th  7th
#4    14   19   23   36   44   46    7
#11   14   19   23   36   44   46    7
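
To see why both `duplicated()` calls are needed, it may help to look at the two logical vectors separately. This is a small illustrative sketch; the names `dup_forward` and `dup_backward` are invented here, and `check.names = FALSE` is an assumption made so that the headers match the output above.

```r
# Example data from the question
df <- read.table(text =
"1st 2nd 3rd 4th 5th 6th 7th
1    5  32  34  38  39  49   8
2   10  20  21  33  40  44  34
3   10  20  26  28  35  48  13
4   14  19  23  36  44  46   7
5    9  24  25  27  36  38  41
6    7  13  14  20  29  32  28
7   11  22  24  28  29  38  20
8    1  11  29  33  36  44  37
9    9  12  25  31  43  44   5
10   1   5   6  31  39  46  44
11  14  19  23  36  44  46   7",
header = TRUE, check.names = FALSE)

# Scanning top to bottom: a row is flagged if an identical row came before it
dup_forward <- duplicated(df)
# Scanning bottom to top: a row is flagged if an identical row comes after it
dup_backward <- duplicated(df, fromLast = TRUE)

which(dup_forward)    # only the later copy (row 11)
which(dup_backward)   # only the earlier copy (row 4)

# Combining both with | keeps every member of a duplicated group
both <- df[dup_forward | dup_backward, ]
print(both)
```

Using `duplicated(df)` alone would return only row 11, because the first occurrence of a value is never marked as a duplicate.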

【Comments】:

【Solution 2】:

An option using dplyr::group_by_all() is quite convenient:

library(dplyr)

df %>% group_by_all() %>%
  filter(n()>1)  # n()>1 will make sure to return only rows having duplicates

# # A tibble: 2 x 7
# # Groups: X1st, X2nd, X3rd, X4th, X5th, X6th, X7th [1]
#    X1st  X2nd  X3rd  X4th  X5th  X6th  X7th
#   <int> <int> <int> <int> <int> <int> <int>
# 1    14    19    23    36    44    46     7
# 2    14    19    23    36    44    46     7

Data:

df <- read.table(text = 
"1st 2nd 3rd 4th 5th 6th 7th
1    5  32  34  38  39  49   8
2   10  20  21  33  40  44  34
3   10  20  26  28  35  48  13
4   14  19  23  36  44  46   7
5    9  24  25  27  36  38  41
6    7  13  14  20  29  32  28
7   11  22  24  28  29  38  20
8    1  11  29  33  36  44  37
9    9  12  25  31  43  44   5
10   1   5   6  31  39  46  44
11   14  19  23  36  44  46   7",
header = TRUE)
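
The two solutions print different column names (`1st` versus `X1st`). A likely explanation, offered here as an assumption rather than something stated in the answers, is the `check.names` argument of `read.table()`: by default, headers that are not syntactically valid R names are adjusted, so a header starting with a digit gets an `X` prefix. A short sketch with a shortened copy of the data (the names `txt`, `df_default` and `df_asis` are illustrative):

```r
# A shortened copy of the question's data, enough to show the header handling
txt <- "1st 2nd 3rd 4th 5th 6th 7th
1    5  32  34  38  39  49   8
2   10  20  21  33  40  44  34
3   14  19  23  36  44  46   7"

# Default: check.names = TRUE turns "1st" into the valid name "X1st"
df_default <- read.table(text = txt, header = TRUE)

# check.names = FALSE keeps the headers exactly as written
df_asis <- read.table(text = txt, header = TRUE, check.names = FALSE)

print(names(df_default))
print(names(df_asis))
```

Columns with non-syntactic names such as `1st` have to be quoted with backticks when referred to by name, for example df_asis$`1st`.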

【Comments】: