R: Access values from a different data frame by ID and item and append them as columns to the first data frame (answers)

【Question title】: R Access values from different dataframe dependent on an ID value and an item and append column to first dataframe
【Posted】: 2015-01-16 22:17:49
【Question description】:

I am looking for a solution to the following problem: my first data frame contains ratings, e.g., two subjects (subject IDs 1 and 2) rated two items on a discrete scale.

ratings  <- data.frame(ID=c(1, 1, 2, 2), item=c(1, 2, 1, 2), rating=c(1, -4, 3, 2))

This yields the following data frame:

 ID item rating
  1    1      1
  1    2     -4
  2    1      3
  2    2      2

Then I have a choice data frame in which, e.g., the two subjects choose between 2 items.

choice  <- data.frame(ID=c(1, 1, 2, 2), item_L=c(1, 2, 1, 2), 
                      item_R=c(2,1,2,1), choice_item_Left=c(0,1,1,0))

This yields the following data frame:

 ID item_L item_R choice_item_Left
  1      1      2             0
  1      2      1             1
  2      1      2             1
  2      2      1             0

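My problem is the following: I want to access the ratings data frame and use the ratings of the left and right items as new columns in the choice data frame, depending on the subject ID and the item number. So I need two new columns in the choice data frame, rating_item_L and rating_item_R, whose values are looked up in the ratings data frame by subject ID and item number.

The desired data frame would look like this: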

  ID item_1 item_2 choice_item_Left rating_item_L rating_item_R
1  1      1      2                0             1            -4
2  1      2      1                1            -4             1
3  2      1      2                1             3             2
4  2      2      1                0             2             3

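Importantly, I have more choices than ratings, and the ratings are in sequential order (e.g., from 1 to 20), but the choices are not. So there are choices such as 3 vs 9 or 2 vs 8.

Does anyone know a solution?

【Question discussion】:

Tags: r dataframe

【Solution 1】: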

You can use merge like this.

## merge left items
xx <- merge(ratings, choice, by.x=c('ID','item'), by.y=c('ID','item_L'))
## merge right items
yy <- merge(ratings, choice, by.x=c('ID','item'), by.y=c('ID','item_R'))
## join left and right data on ID and on both items of the pair
## (in xx, 'item' is the left item; in yy, 'item' is the right item)
res <- merge(xx, yy, by.x=c('ID','item','item_R'), by.y=c('ID','item_L','item'))
#   ID item item_R rating.x choice_item_Left.x rating.y choice_item_Left.y
# 1  1    1      2        1                  0       -4                  0
# 2  1    2      1       -4                  1        1                  1
# 3  2    1      2        3                  1        2                  1
# 4  2    2      1        2                  0        3                  0

Of course, you can rearrange the columns and rename them to get the exact output.

setNames(res[,c("ID","item","item_R","choice_item_Left.x","rating.x","rating.y")],
         c("ID","item_1","item_2","choice_item_Left","rating_item_L","rating_item_R"))

#   ID item_1 item_2 choice_item_Left rating_item_L rating_item_R
# 1  1      1      2                0             1            -4
# 2  1      2      1                1            -4             1
# 3  2      1      2                1             3             2
# 4  2      2      1                0             2             3
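
As an alternative sketch (not part of the original answer), the two ratings can also be looked up directly with match() on a combined ID/item key, which leaves the rows of choice in their original order. The helper name lookup_rating is hypothetical, and the sketch assumes each ID/item pair occurs at most once in ratings.

```r
ratings <- data.frame(ID = c(1, 1, 2, 2), item = c(1, 2, 1, 2),
                      rating = c(1, -4, 3, 2))
choice <- data.frame(ID = c(1, 1, 2, 2), item_L = c(1, 2, 1, 2),
                     item_R = c(2, 1, 2, 1), choice_item_Left = c(0, 1, 1, 0))

# Hypothetical helper: return the rating for each (id, item) pair,
# or NA when the pair has no entry in ratings.
lookup_rating <- function(id, item) {
  key <- paste(ratings$ID, ratings$item, sep = "_")
  ratings$rating[match(paste(id, item, sep = "_"), key)]
}

# One lookup per side; the row order of choice is untouched.
choice$rating_item_L <- lookup_rating(choice$ID, choice$item_L)
choice$rating_item_R <- lookup_rating(choice$ID, choice$item_R)
choice
```

Because nothing is merged or sorted here, this variant may be convenient when the choice trials are not in sequential order, as described in the question.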

【Discussion】:

Thank you very much. I have now found that there is one more thing I need to do: I need to set sort=F. Then I get what I want. Thanks!
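
A note on the sort=F remark: the documentation for merge states that with sort = FALSE the rows come back in an unspecified order, so the original trial order is not guaranteed. A minimal sketch of one way to keep it, assuming a helper column (here called row_id, a hypothetical name) is acceptable; the row shuffle only mimics unsorted trials and is not from the original post.

```r
ratings <- data.frame(ID = c(1, 1, 2, 2), item = c(1, 2, 1, 2),
                      rating = c(1, -4, 3, 2))
choice <- data.frame(ID = c(1, 1, 2, 2), item_L = c(1, 2, 1, 2),
                     item_R = c(2, 1, 2, 1), choice_item_Left = c(0, 1, 1, 0))

choice <- choice[c(3, 1, 4, 2), ]        # shuffle rows to mimic unsorted trials
choice$row_id <- seq_len(nrow(choice))   # hypothetical column remembering the order

# Attach the rating of the left item, then of the right item.
res <- merge(choice, ratings, by.x = c("ID", "item_L"), by.y = c("ID", "item"))
names(res)[names(res) == "rating"] <- "rating_item_L"
res <- merge(res, ratings, by.x = c("ID", "item_R"), by.y = c("ID", "item"))
names(res)[names(res) == "rating"] <- "rating_item_R"

# Restore the original trial order explicitly.
res <- res[order(res$row_id), ]
res
```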