【发布时间】:2020-02-18 04:13:25
【问题描述】:
我正在使用 glm 回归模型的输出系数,我需要创建一个查找值,使用键粘贴 ([列名].[因子级别],然后从另一个数据表返回相应的值。列名称必须是动态的,这样我就不必一一明确地命名每一列。 然后将查找返回的值乘以 1(对于因子)或实际数值,并将所有 coef_colnames 相加到 Total 列中。
我在 excel 中做了一些示例,但无法在 R 中复制它。 var_Factor1 结合每行的列名和因子级别(使用粘贴)来构建下一步查找的键
var_Number1 只是列名,因为它是数字并且没有因子级别
library(dplyr)
# original data
dt = data.table(
Factor1 = c("A","B","C"),
Number1 = c(10, 20,40),
Factor2 = c("D","H","N"),
Number2 = c(2, 5,3)
)
# Lookup table
model_coef = data.table(
Factor1.A = 10,
Factor1.B = 20,
Factor1.C = 30,
Factor2.D = 40,
Factor2.H = 50,
Factor2.N = 60,
Number1 = 200,
Number2 = 500
)
#initial steps
dt <- dt %>% mutate (
var_Factor1 = paste("Factor1", Factor1, sep =".")
, var_Number1 = "Number1"
, var_Factor2 = paste("Factor2", Factor2, sep =".")
, var_Number2 = "Number2"
) %>% mutate (
coef_Factor1 = model_coef[,var_Factor1]
)
#The final output should produce (as replicated from Excel)
final_output = data.table (
Factor1= c("A", "B", "C"),
Number1= c(10, 20, 40),
Factor2= c("D", "H", "N"),
Number2= c(2, 5, 3),
var_Factor1= c("Factor1.A", "Factor1.B", "Factor1.C"),
var_Number1= c("Number1", "Number1", "Number1"),
var_Factor2= c("Factor2.D", "Factor2.H", "Factor2.N"),
var_Number2= c("Number2", "Number2", "Number2"),
coef_Factor1= c(10, 20, 30),
coef_Number1= c(200, 200, 200),
coef_Factor2= c(40, 50, 60),
coef_Number2= c(500, 500, 500),
calc_Factor1= c(10, 20, 30),
calc_Number1= c(2000, 4000, 8000),
calc_Factor2= c(40, 50, 60),
calc_Number2= c(1000, 2500, 1500),
Total= c(3050, 6570, 9590)
)
【问题讨论】:
-
我的解决方案有帮助吗?