r - copy value based on match in another column: answers

【Question title】: r - copy value based on match in another column
【Posted】: 2015-11-30 12:37:06
【Question description】:

In this data frame:

Item <- c("A","B","A","A","A","A","A","B")
Trial <- c("Fam","Fam","Test","Test","Test","Test","Test","Test")
Condition <-c("apple","cherry","Trash","Trash","Trash","Trash","Trash","Trash")
ID <- c(rep("01",8))


df <- data.frame(cbind(Item,Trial,Condition,ID))

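I want to replace the "Trash" values of df$Condition in the rows where df$Trial == "Test". The new values of df$Condition should be a copy of df$Condition from the rows where df$Trial == "Fam", with the Fam and Test trials matched on df$Item.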

So my final data frame should look like this:

  Item Trial Condition ID
1    A   Fam     apple 01
2    B   Fam    cherry 01
3    A  Test     apple 01
4    A  Test     apple 01
5    A  Test     apple 01
6    A  Test     apple 01
7    A  Test     apple 01
8    B  Test    cherry 01

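Ultimately, I want to do this for each unique ID in the original data frame. So I guess I will have to use ddply or apply the function later.

【Question discussion】:

Tags: r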

【Solution 1】:

You can do a binary self-join of df against its rows where Trial != "Test" and update the Condition column by reference using the data.table package, e.g.

library(data.table) ## V 1.9.6+
setDT(df)[df[Trial != "Test"], Condition := i.Condition, on = c("Item", "ID")]
df
#    Item Trial Condition ID
# 1:    A   Fam     apple 01
# 2:    B   Fam    cherry 01
# 3:    A  Test     apple 01
# 4:    A  Test     apple 01
# 5:    A  Test     apple 01
# 6:    A  Test     apple 01
# 7:    A  Test     apple 01
# 8:    B  Test    cherry 01

Or (with a small modification of @docendodiscimus's suggestion), simply

setDT(df)[, Condition := Condition[Trial != "Test"], by = .(Item, ID)]
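If installing a package is not an option, the same lookup can be sketched in base R with match(). This is only an illustration under assumptions: the columns are plain character vectors, each (Item, ID) pair has exactly one "Fam" row, and the helper names fam, key_all, key_fam and is_test are made up for this example.

```r
# Example data from the question, built with character columns
df <- data.frame(
  Item      = c("A", "B", "A", "A", "A", "A", "A", "B"),
  Trial     = c("Fam", "Fam", rep("Test", 6)),
  Condition = c("apple", "cherry", rep("Trash", 6)),
  ID        = rep("01", 8),
  stringsAsFactors = FALSE
)

# Rows that carry the real Condition values
fam <- df[df$Trial == "Fam", ]

# Build one composite key per row from Item and ID
key_all <- paste(df$Item, df$ID, sep = "|")
key_fam <- paste(fam$Item, fam$ID, sep = "|")

# For every Test row, look up the Fam row with the same key
is_test <- df$Trial == "Test"
df$Condition[is_test] <- fam$Condition[match(key_all[is_test], key_fam)]

df
```

Test rows without a matching Fam row would end up as NA here, which makes missing lookups easy to spot.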

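【Discussion】:

I get an error when running this: [.data.table(setDT(dataAll), dataAll[Trial != "Test"], :=(condition, : unused argument (on = c("Item", "ppcode")). Here ID == ppcode, and it has several levels referring to different people. I get the same error when I run only that. I also can't get my R Studio to update data.table to version 1.9.6; I only get as far as 1.9.4.
That is why I wrote V 1.9.6+. Please update to the latest data.table version on CRAN.
Can you help me update it via CRAN? When I click "update packages" in R Studio, the data.table package is not listed.
Run install.packages("data.table") and reload with library(data.table)
That didn't solve it. I always end up with version 1.9.4. Maybe it's because I'm at work and some firewall is blocking the download of the latest version....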
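For anyone stuck on a data.table release without the on argument, a keyed join is a possible workaround. The sketch below is a hedged illustration, not something posted in this thread: it checks the loaded version with packageVersion() and then joins on keys set with setkey(). The names dt and fam are made up for the example, and it has not been tested on 1.9.4 specifically.

```r
library(data.table)

# Check which version is actually loaded; `on =` needs 1.9.6 or later
packageVersion("data.table")

dt <- data.table(
  Item      = c("A", "B", "A", "A", "A", "A", "A", "B"),
  Trial     = c("Fam", "Fam", rep("Test", 6)),
  Condition = c("apple", "cherry", rep("Trash", 6)),
  ID        = rep("01", 8)
)

# Key both tables so the join columns come from the keys instead of `on`
fam <- dt[Trial != "Test"]
setkey(fam, Item, ID)
setkey(dt, Item, ID)   # note: this sorts dt by Item and ID

# Update Condition by reference from the matching Fam row
dt[fam, Condition := i.Condition]
dt
```

Because setkey() reorders the rows, the result is sorted by Item and ID rather than kept in the original row order.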

【Solution 2】:

Here is an option using dplyr

library(dplyr)
distinct(df) %>% 
    filter(Trial=='Fam') %>% 
    left_join(df, ., by = c('Item', 'ID')) %>% 
    mutate(Condition = ifelse(Condition.x=='Trash',
            as.character(Condition.y), as.character(Condition.x))) %>% 
    select(c(1,2,4,7))

Or, as suggested by @docendodiscimus

df %>% 
    group_by(ID, Item) %>%
    mutate(Condition = Condition[Condition != "Trash"])

【Discussion】:

For the given example, this should also work: df %>% group_by(ID, Item) %>% mutate(Condition = Condition[Condition != "Trash"])
@docendodiscimus You should post it, together with setDT(df)[, Condition := Condition[Trial != "Test"], by = .(Item, ID)]
@DavidArenburg, there is no need for another answer here. Either akrun or you can add it to your answer

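【Solution 3】:

You can also just write a for loop and iterate over all the values that need to change. This setup makes it easy to add other items and/or change the condition types later.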

df2 <- df  # work on a copy so df stays unchanged
for (i in seq_len(nrow(df))) {
    # Set Condition (column 3) from the Item value (column 1)
    if (df[i, 1] == "A") {
        df2[i, 3] <- "apple"
    } else if (df[i, 1] == "B") {
        df2[i, 3] <- "cherry"
    }
}
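The same hard-coded mapping can also be written without a loop, using a named vector as a lookup table. This is a minimal sketch that assumes the Item-to-Condition pairs are known in advance, as in the loop above; the name lookup is made up for this example.

```r
df <- data.frame(
  Item      = c("A", "B", "A", "A", "A", "A", "A", "B"),
  Trial     = c("Fam", "Fam", rep("Test", 6)),
  Condition = c("apple", "cherry", rep("Trash", 6)),
  ID        = rep("01", 8),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: names are Item values, elements are Conditions
lookup <- c(A = "apple", B = "cherry")

df2 <- df
# Index by name; as.character() guards against Item being a factor
df2$Condition <- unname(lookup[as.character(df2$Item)])
df2
```

Adding another item then only means adding one more entry to lookup.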

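【Discussion】:

Nice, but I'd rather have something generic that works from the column names; looping over all possible values is rather cumbersome in my case
You should have mentioned that in your question ;). Kind of a waste of my time here, haha.
Lesson learned for next time! Still, thanks for your help with this.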