How to drop terms from a mixed model formula: answers

【Question title】: How to drop terms from a mixed model formula
【Posted】: 2015-05-07 19:15:42
【Question description】:

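I have a rather specific regex problem that is causing me some grief. I have removed one or more fixed effects from a mixed model (either lme or lme4) and want to remove the corresponding random slope(s). However, depending on the random structure, this can leave behind unnecessary + symbols or, worse, leave nothing before the |.

Take the following list of random-effect formulas, obtained for lme and lme4 with lme.model$call$random and findbars(formula(lme4.model)) respectively: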

random.structures = list(
  "~ b | random1",
  "(b | random1)",
  "~ b + x1 | random1",
  "(b + x1 | random1)",
  "~ x1 + b| random1",
  "(x1 + b| random1)",
  "~ b + x1 + c | random1",
  "(b+ x1 + c | random1)",
  "~b + x1 + x2 | random1",
  "(b + x1 + x2 | random1)",
  "~ x1 + x2 + b | random1",
  "(x1 + x2 + b | random1)"
)

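I have removed the variables b and c from the fixed-effects formula using dropterms. Since they no longer exist as fixed effects, their random slopes should not be allowed to vary.

b and c can be removed from the random formulas above with the following line: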

random.structures = lapply(random.structures, function(i) gsub("b|c", "", i))
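
One caveat with the pattern above: "b|c" matches every b or c character, not just the standalone variable names. It happens to be harmless for this particular list, but a longer variable name would be mangled. A minimal sketch, assuming a hypothetical variable called block that is not part of the original question:

```r
# "block" is a hypothetical variable name used only for illustration
s <- "~ b + x1 + block | random1"

# Unanchored pattern: strips every "b" and "c" character, mangling "block"
loose <- gsub("b|c", "", s)

# Word boundaries: only the standalone names b and c are removed
strict <- gsub("\\b(b|c)\\b", "", s, perl = TRUE)

print(loose)
print(strict)
```

Either way, the leftover + symbols and the empty left-hand side still have to be dealt with afterwards.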

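Now I want to remove all leftover + symbols, i.e. those that no longer link two variables.

Then, if there is nothing but whitespace between the ~ or ( and the |, I want to insert a 1.

The desired output is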

random.structures2 = list(
  "~ 1 | random1",
  "(1 | random1)",
  "~ x1 | random1",
  "(x1 | random1)",
  "~ x1 | random1",
  "(x1 | random1)",
  "~ x1 | random1",
  "(x1 | random1)",
  "~ x1 + x2 | random1",
  "(x1 + x2 | random1)",
  "~ x1 + x2 | random1",
  "(x1 + x2 | random1)"
)

I have fiddled around with gsub but cannot seem to get it right. For example, this works:

gsub("(.*)\\+\\ |(.*)\\+(\\|)", "\\1", random.structures[[3]])
# Accounting for space or lack of space between + and |

But it does not work for this one:

gsub("(.*)\\+\\ |(.*)\\+(\\|)", "\\1", random.structures[[7]])

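Alternatively, if there is a pre-existing function like dropterms that works on random structures, I am all ears!

Likewise, I cannot reliably insert a 1 into the blank space between ~ | or ( |.

【Question discussion】: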

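Are you sure you want to solve this with regular expressions? If you are working with formulas, there are other functions for manipulating them that will not leave you with invalid syntax. If you want to drop a variable, try update(y~a+b, ~.-b)
Does that work for mixed models, particularly those from the lme4 package, where update applies to the whole formula (fixed and random effects)?
It would help if you provided a reproducible example of what you are actually trying to accomplish. Give the input formula, the variables to drop, and the desired output.
I have updated the question to demonstrate the problem across the whole random.structures list, in addition to the two examples I had provided.
Piecing the random formula together, rather than substituting within the existing one, turned out to be the most effective approach. Thanks for the suggestion, @Frank.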
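
The thread does not show the code for that piecing-together approach, so the following is only a minimal base-R sketch of what it might look like. The helper name rebuild_random is hypothetical. It assumes each random structure is a single string of the form "~ a + b | group" or "(a + b | group)" with purely additive slope terms; interactions, "0 +" terms, and nested or double-bar groupings are not handled.

```r
# Hypothetical helper: rebuild a random-effects string without the dropped slopes
rebuild_random <- function(s, drop) {
  # Remember whether this is the lme4-style "( ... )" form
  paren <- grepl("^\\s*\\(", s, perl = TRUE)

  # Strip the leading "~" or "(" and any trailing ")"
  core <- gsub("^\\s*[~(]\\s*|\\s*\\)\\s*$", "", s, perl = TRUE)

  # Split into slope part and grouping factor
  parts <- strsplit(core, "|", fixed = TRUE)[[1]]
  slopes <- trimws(strsplit(parts[1], "+", fixed = TRUE)[[1]])
  group <- trimws(parts[2])

  # Keep the remaining slopes; fall back to a random intercept only
  keep <- setdiff(slopes, drop)
  if (length(keep) == 0) keep <- "1"
  lhs <- paste(keep, collapse = " + ")

  if (paren) sprintf("(%s | %s)", lhs, group) else sprintf("~ %s | %s", lhs, group)
}

print(rebuild_random("~ b | random1", drop = c("b", "c")))
print(rebuild_random("(b+ x1 + c | random1)", drop = c("b", "c")))
```

Because the left-hand side is rebuilt from the surviving names, stray + symbols and an empty slot before the | cannot occur in the first place.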

Tags: regex r lme4 mixed-models nlme

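【Solution 1】:

Half of the items in your starting list are proper formulas (the ones with "~"). I'm not sure what you are doing with the terms in parentheses. For the formulas, though, you can use the Formula package, which has better support for dropping terms when conditioning terms are present.

Here I subset to the proper formulas and convert them to Formula objects.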

library(Formula)
rx <- lapply(random.structures[grep("~", random.structures)],
    function(x) Formula(as.formula(x)))

We can take a quick peek at them

sapply(rx, deparse)

# [1] "~b | random1"
# [2] "~b + x1 | random1"
# [3] "~x1 + b | random1"
# [4] "~b + x1 + c | random1"
# [5] "~b + x1 + x2 | random1"
# [6] "~x1 + x2 + b | random1"

Now we can remove b and c from all of them

nx <- lapply(rx, function(x) update(x, ~.-b-c))

and look at the results

sapply(nx, deparse)

# [1] "~1 | random1" 
# [2] "~x1 | random1"
# [3] "~x1 | random1"
# [4] "~x1 | random1"
# [5] "~x1 + x2 | random1"
# [6] "~x1 + x2 | random1"

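There should be no problem using these wherever a regular formula is expected.

【Discussion】:

Hm, that is an interesting approach, and it should work for lme, where the random-effects formula is stored separately. Thanks! I did not know that Formula would preserve the bars. Unfortunately it does not work with lmer's syntax: it does not drop or replace the random slope, e.g.: x = "y ~ x2 + (x2 | random1)"; x = Formula(as.formula(x)); update(x, ~.-x2)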
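
For the lmer-style case raised in this comment, one possibility is to walk the formula's expression tree in base R and apply update() only to the left-hand side of each bar term. This is a sketch under assumptions, not something from lme4 or Formula: drop_random_slopes is a hypothetical helper, it only touches terms built with | or ||, and because it relies on update(), the surviving slope terms may be expanded or reordered (for example x1*x2 becoming x1 + x2 + x1:x2).

```r
# Hypothetical helper: drop slopes inside the bar terms of an lmer-style formula
drop_random_slopes <- function(f, drop) {
  # One-sided update template, e.g. ~ . - b - c
  minus <- as.formula(paste("~ . -", paste(drop, collapse = " - ")))

  walk <- function(e) {
    if (!is.call(e)) return(e)  # symbols and constants are left alone
    is_bar <- is.name(e[[1]]) && as.character(e[[1]]) %in% c("|", "||")
    if (is_bar) {
      # e[[2]] holds the slopes, e[[3]] the grouping factor
      slopes <- eval(call("~", e[[2]]))
      e[[2]] <- update(slopes, minus)[[2]]
      return(e)
    }
    # Otherwise recurse into every component of the call
    as.call(lapply(as.list(e), walk))
  }

  rhs <- length(f)  # 3 for two-sided formulas, 2 for one-sided ones
  f[[rhs]] <- walk(f[[rhs]])
  f
}

f <- y ~ x2 + (x2 | random1)
print(drop_random_slopes(f, "x2"))
```

Note that this only rewrites the random part; the fixed effect x2 is left in place and would have to be dropped in a separate step.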