Return the row indices of df1 when those row values occur in df2 in R: answers

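【Question title】: Return the row indices of df1 when those row values occur in df2 in R
【Posted】: 2021-03-03 11:50:03
【Question description】:

I'm coding in R. I have a large data frame (df1) and a small data frame (df2). df2 is a subset of df1, but its rows are in random order. I need the row indices of df1 at which the rows of df2 occur. Every individual cell value is duplicated many times: Tapirus terrestris appears in many rows, and so does each ModType value. I experimented with which() and grepl(), but couldn't get my code to work.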

df1 <- data.frame(
  SpeciesName = c('Tapirus terrestris', 'Panthera onca', 'Leopardus tigrinus' , 'Leopardus tigrinus'),
  ModType   = c('ANN', 'GAM', 'GAM','RF'),
  Variable_scale = c('aspect_s2_sd', 'CHELSAbio1019_s3_sd','CHELSAbio1015_s4_sd','CHELSAbio1015_s4_sd')) 


df2 <- data.frame(
  SpeciesName = c('Tapirus terrestris', 'Leopardus tigrinus'),
  ModType   = c('ANN', 'RF'),
  Variable_scale = c('aspect_s2_sd', 'CHELSAbio1015_s4_sd'))

The output should be a vector: 1, 4, because rows 1 and 4 of df1 appear in df2.

【Question comments】:

Tags: r dataframe indexing match

【Solution 1】:

Another option is the tidyverse:

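library(dplyr)

# Tag each row of df1 with its row number, then keep only the rows that also occur in df2.
# With no `by` argument, inner_join() joins on all columns the two data frames share.
# The `index` column of the result holds the matching df1 row numbers.
df1 %>%
    mutate(index = row_number()) %>%
    inner_join(df2)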

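【Discussion】:

【Solution 2】:

You can use match. Note that matching on SpeciesName alone returns the first df1 row for each species, so for the example data it selects rows 1 and 3 rather than rows 1 and 4. It gives the intended rows only when SpeciesName is unique in df1.

df1[match(df2$SpeciesName, df1$SpeciesName), ]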
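Because every single column in the question contains duplicates, one way to keep using match is to compare whole rows instead of one column. The following is a minimal sketch under stated assumptions: `row_key()` is a hypothetical helper (not part of base R) that pastes the chosen columns of each row into one string, and the separator `"\r"` is assumed never to occur in the data.

```r
# Data from the question.
df1 <- data.frame(
  SpeciesName    = c("Tapirus terrestris", "Panthera onca", "Leopardus tigrinus", "Leopardus tigrinus"),
  ModType        = c("ANN", "GAM", "GAM", "RF"),
  Variable_scale = c("aspect_s2_sd", "CHELSAbio1019_s3_sd", "CHELSAbio1015_s4_sd", "CHELSAbio1015_s4_sd")
)
df2 <- data.frame(
  SpeciesName    = c("Tapirus terrestris", "Leopardus tigrinus"),
  ModType        = c("ANN", "RF"),
  Variable_scale = c("aspect_s2_sd", "CHELSAbio1015_s4_sd")
)

# Hypothetical helper: build one string key per row from the given columns.
# Assumes the separator "\r" does not appear in any cell.
row_key <- function(d, cols) {
  do.call(paste, c(unname(as.list(d[cols])), sep = "\r"))
}

cols <- names(df2)  # compare on every column of df2
idx  <- match(row_key(df2, cols), row_key(df1, cols))
idx
#[1] 1 4
```

The result follows the row order of df2, so wrap it in sort() if the indices are wanted in df1 order. match reports only the first hit for each key; if df1 can contain fully identical rows and all of them are needed, `which(row_key(df1, cols) %in% row_key(df2, cols))` is an alternative.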

【Discussion】:

【Solution 3】:

You can create an index column in df1 and merge the two datasets.

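df1$index <- seq_len(nrow(df1))  # record each row's original position in df1
df3 <- merge(df1, df2)           # joins on all shared columns; result is sorted by those columns
df3$index
#[1] 4 1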
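The indices come back as 4 1 because merge sorts its result by the join columns, not by the original row order. A small variation, sketched here with illustrative names (`tmp`, `merged`, `row_idx` are not from the original answer), works on a copy so that df1 is left unchanged, names the join columns explicitly, and sorts the indices back into df1 order.

```r
# Data from the question.
df1 <- data.frame(
  SpeciesName    = c("Tapirus terrestris", "Panthera onca", "Leopardus tigrinus", "Leopardus tigrinus"),
  ModType        = c("ANN", "GAM", "GAM", "RF"),
  Variable_scale = c("aspect_s2_sd", "CHELSAbio1019_s3_sd", "CHELSAbio1015_s4_sd", "CHELSAbio1015_s4_sd")
)
df2 <- data.frame(
  SpeciesName    = c("Tapirus terrestris", "Leopardus tigrinus"),
  ModType        = c("ANN", "RF"),
  Variable_scale = c("aspect_s2_sd", "CHELSAbio1015_s4_sd")
)

tmp <- df1                                  # copy, so df1 gains no extra column
tmp$index <- seq_len(nrow(tmp))             # remember each row's position in df1
merged <- merge(tmp, df2, by = names(df2))  # join on all columns of df2
row_idx <- sort(merged$index)               # restore df1 order
row_idx
#[1] 1 4
```

Passing `by = names(df2)` makes the join columns explicit, which avoids surprises if the two data frames later gain other columns with the same name.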

【Discussion】: