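【Posted】: 2021-02-22 04:02:04
【Question】:
The title says it all: changing the (presumably arbitrary) labels of a random-effects grouping variable (e.g., the subject names in a repeated-measures experiment) can change the output of lme4. A minimal example: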
require(dplyr)
require(lme4)
require(digest)
df = faithful %>% mutate(subject = rep(as.character(1:8), each = 34),
                         subject2 = rep(as.character(9:16), each = 34))
summary(lmer(eruptions ~ waiting + (waiting | subject), data = df))$coefficients[2,1] # = 0.07564181
summary(lmer(eruptions ~ waiting + (waiting | subject2), data = df))$coefficients[2,1] # = 0.07567655
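I think this happens because lme4 converts the labels to factors, and different names yield a different ordering of the factor levels. For example, this reproduces the problem: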
df2 = faithful %>% mutate(subject = factor(rep(as.character(1:8), each = 34)),
                          subject2 = factor(rep(as.character(9:16), each = 34)))
summary(lmer(eruptions ~ waiting + (waiting | subject), data = df2))$coefficients[2,1] # = 0.07564181
summary(lmer(eruptions ~ waiting + (waiting | subject2), data = df2))$coefficients[2,1] # = 0.07567655
But this does not:
df3 = faithful %>% mutate(subject = factor(rep(as.character(1:8), each = 34)),
                          subject2 = factor(rep(as.character(1:8), each = 34),
                                            levels = as.character(1:8),
                                            labels = as.character(9:16)))
summary(lmer(eruptions ~ waiting + (waiting | subject), data = df3))$coefficients[2,1] # = 0.07564181
summary(lmer(eruptions ~ waiting + (waiting | subject2), data = df3))$coefficients[2,1] # = 0.07564181
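To make the suspected level-ordering difference concrete, here is a small base R check that needs neither lme4 nor dplyr. It only shows how `factor()` orders the two label sets by default. It does not by itself prove that this is what changes the fit, and the exact sort order is assumed to follow the usual character collation (it can depend on the locale).

```r
# The two label sets used for subject and subject2
labels_a <- as.character(1:8)
labels_b <- as.character(9:16)

# By default, factor() sorts the unique values as character strings,
# so "10" comes before "9"
levels_a <- levels(factor(labels_a))
levels_b <- levels(factor(labels_b))
print(levels_a)
print(levels_b)

# Integer codes: the position of each subject within the sorted levels.
# For labels_a the subjects keep their original order; for labels_b the
# first subject ("9") is moved to the last level.
codes_a <- as.integer(factor(labels_a))
codes_b <- as.integer(factor(labels_b))
print(codes_a)
print(codes_b)

# Passing levels explicitly keeps the original subject order
codes_fixed <- as.integer(factor(labels_b, levels = labels_b))
print(codes_fixed)
```

With explicit `levels`, the relabeled factor has the same integer codes as the original one, which matches the df3 case where both fits agree.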
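This looks like a problem in lme4. Different arbitrary variable labels should not produce different output, right? Am I missing something? Why does lme4 do this?
(I know the difference in the output here is small, but in other cases I have seen larger differences, enough to move a p-value from 0.055 to 0.045. Also, if this is right, I think it could cause minor reproducibility problems, e.g., if an experimenter anonymizes their human-subjects data (by changing the names) after finishing the analysis and then posts it to a public repository.)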
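For the anonymization scenario, one possible sketch is to relabel the subjects while keeping the level order fixed, in the same way as the df3 example. The helper name `anonymize_subjects` and the `"S"` prefix are hypothetical, chosen only for illustration. Whether this is enough to reproduce a given fit exactly is an assumption based on the df3 result, not something checked against lme4 internals.

```r
# Hypothetical helper: replace subject labels with anonymous IDs while
# preserving the level order that factor() produced for the original labels.
anonymize_subjects <- function(x, prefix = "S") {
  f <- factor(x)  # same default level order as in the original analysis
  n <- length(levels(f))
  # Zero-pad the IDs so that sorting the new labels as strings
  # gives the same order as the original levels
  ids <- formatC(seq_len(n), width = nchar(n), flag = "0")
  factor(f, levels = levels(f), labels = paste0(prefix, ids))
}

subject <- rep(as.character(9:16), each = 2)
anon <- anonymize_subjects(subject)

# The integer codes are unchanged by the relabeling
same_codes <- identical(as.integer(factor(subject)), as.integer(anon))
print(same_codes)

# Re-creating the factor from the anonymized strings (e.g. after a CSV
# round trip) gives the same level order again
same_levels <- identical(levels(factor(as.character(anon))), levels(anon))
print(same_levels)
```

The zero-padding matters because a published data file usually stores the labels as plain strings, so whoever re-runs the analysis gets the default sorted level order again.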
【Discussion】:
Tags: r lme4 mixed-models