R function to replace multiple values in a dataset: answers

【Question title】: R function to replace multiple values in a dataset
【Posted】: 2020-11-11 10:33:22
【Question description】:

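I am very new to R and to coding in general. I am working with HT-qPCR data and have hundreds of gene codes that need to be changed to gene names. I am using the function revalue from the package plyr, and it works well: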

Ct1 <- revalue(Ct_data$Gene, c("AY1" = "16s"))

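However, since I have hundreds of values to rename, I was wondering whether there is a way to do this in a loop across all samples. I have an Excel file that pairs each gene code with its gene name, so could someone point me in the right direction on how to use this Excel file to rename the values?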

【Comments】:

Hi. You mean replacing values, right? Renaming would apply to column names.
Hi, yes, I mean replacing values.

Tags: r plyr

【Solution 1】:

If your replacements are already in a data frame (read your Excel file into R), then this is a join. Something like this:

# 1 read your excel file into R
library(readxl)
lookup = read_excel("path/to/your_excel.xlsx")
## I'll pretend the column names are `bad_name` and `good_name`

# 2 join to your current data
library(dplyr)
Ct_data = left_join(Ct_data, lookup, by = c("Gene" = "bad_name"))

# 3 OPTIONAL manually spot-check to make sure the good names are correct
View(Ct_data[c("Gene", "good_name")])

# 4 OPTIONAL if not all names are replaced, you may need to keep the original name in case `good_name` is missing
Ct_data = Ct_data %>% 
  mutate(good_name = coalesce(good_name, Gene))

# Drop the old column, keep the new
Ct_data = Ct_data %>% 
  select(-Gene) %>% 
  rename(Gene = good_name)
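
For comparison, here is a minimal sketch of the same lookup in base R, using match() instead of a join, so it runs without readxl or dplyr. The toy data frames and the name "geneB" are made up for illustration; only the column names `bad_name` and `good_name` follow the assumption made above, and the real lookup table would come from your Excel file.

```r
# Hypothetical toy data standing in for Ct_data and the lookup table read from Excel
Ct_data <- data.frame(
  Gene = c("AY1", "AY2", "AY3", "AY1"),
  Ct = c(21.3, 25.1, 30.2, 22.0),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  bad_name = c("AY1", "AY2"),
  good_name = c("16s", "geneB"),  # "geneB" is a made-up name
  stringsAsFactors = FALSE
)

# Position of each gene code in the lookup table (NA when the code is not listed)
idx <- match(Ct_data$Gene, lookup$bad_name)

# Use the new name where one exists, otherwise keep the original code
Ct_data$Gene <- ifelse(is.na(idx), Ct_data$Gene, lookup$good_name[idx])

print(Ct_data)
```

Here "AY3" has no entry in the lookup table, so it is left unchanged, which mirrors the coalesce() step. If `Gene` is stored as a factor in your data, convert it with as.character() first, since ifelse() would otherwise return the factor's integer codes.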

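If you need more help, please post a small reproducible example with a few rows of data that illustrates the remaining problem. dput() is the best way to share data because it is copy/pasteable and preserves all class and structure information, e.g. dput(Ct_data[1:10, ]) for the first 10 rows. Choose a relevant subset to share.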
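
To illustrate why dput() output is convenient to share, here is a small sketch on made-up data (the values below are hypothetical, not from the question). It prints the code that recreates a subset and then rebuilds the object from that printed text.

```r
# Hypothetical toy data; replace with your own data frame
Ct_data <- data.frame(
  Gene = c("AY1", "AY2", "AY3"),
  Ct = c(21.3, 25.1, 30.2),
  stringsAsFactors = FALSE
)

# Print R code that recreates the first two rows, ready to paste into a question
dput(Ct_data[1:2, ])

# Anyone can rebuild the same object by evaluating that printed code
rebuilt <- eval(parse(text = capture.output(dput(Ct_data[1:2, ]))))
str(rebuilt)
```

The rebuilt object is again a data frame with the same column names and types, which is what makes the output safe to copy and paste.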

【Discussion】: