【Question title】: Regression on subset of data set
【Posted】: 2019-07-27 07:40:21
【Question】:

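I would like to do the following and need some help:

Calculate the slope and intercept of "Height" over "Age" [lm(Height~Age)] separately for

(A) each individual

(B) each gender

and create a table containing the results (slope and intercept). Can I use "apply" for this?

In the next step, I would like to run a statistical test to determine whether the slope and intercept differ significantly between genders. I know how to do the test in R, but maybe there is a way to combine the slope/intercept calculation and the t-test.

Example data: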

example = data.frame(Age = c(1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12), 
                Individual = c("Jack", "Jack", "Jack", "Jack", "Jack",
                               "Jill", "Jill", "Jill", "Jill", "Jill",
                               "Tony", "Tony", "Tony", "Tony", "Tony",
                               "Jen", "Jen", "Jen", "Jen","Jen"),
                    Gender = c("M", "M", "M", "M", "M",
                               "F", "F", "F", "F", "F",
                               "M", "M", "M", "M", "M",
                               "F", "F", "F", "F", "F"),
                    Height = c(38, 62, 92, 119, 165,
                               31, 59, 87, 118, 170,
                               45, 72, 93, 155, 171,
                               33, 61, 92, 115, 168))

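【Comments】:

    Tags: r linear-regression


    【Solution 1】:

    One way to run a separate regression for each level and then combine the slopes and intercepts in a data frame is to use the function ddply() from the plyr package: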

    library(plyr)
    
    ddply(example,"Individual",function(x) coefficients(lm(Height~Age,x)))
      Individual (Intercept)      Age
    1       Jack    26.29188 11.11421
    2        Jen    22.10660 11.56345
    3       Jill    18.33249 12.04315
    4       Tony    33.02030 11.96447
    
    ddply(example,"Gender",function(x) coefficients(lm(Height~Age,x)))
      Gender (Intercept)      Age
    1      F    20.21954 11.80330
    2      M    29.65609 11.53934
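
    On the "apply" part of the question: a similar table can probably be built in base R with split() and sapply(), without plyr. The sketch below is only an illustration; the helper name fit_by and the column names Intercept and Slope are made up here and do not come from the answer above.

```r
# Example data from the question (same values, built more compactly)
example <- data.frame(
  Age = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender = rep(c("M", "F", "M", "F"), each = 5),
  Height = c(38, 62, 92, 119, 165,
             31, 59, 87, 118, 170,
             45, 72, 93, 155, 171,
             33, 61, 92, 115, 168)
)

# Hypothetical helper: fit Height ~ Age within each level of `group`
# and return one row of coefficients per level.
fit_by <- function(data, group) {
  pieces <- split(data, data[[group]])            # list of sub-data-frames
  coefs <- t(sapply(pieces, function(d) {
    coef(lm(Height ~ Age, data = d))              # c("(Intercept)", "Age")
  }))
  out <- data.frame(names(pieces), coefs, row.names = NULL)
  names(out) <- c(group, "Intercept", "Slope")
  out
}

by_individual <- fit_by(example, "Individual")
by_gender <- fit_by(example, "Gender")
print(by_individual)
print(by_gender)
```

    The estimates should match the ddply() output above; only the column labels differ.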
    

    【Discussion】:
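
      The answer covers the per-group fits but not the second part of the question (whether slope and intercept differ between genders). One common way to combine the estimation and the t-tests is a single model with an Age-by-Gender interaction. This is a sketch under assumptions, not something the answer proposes: it treats all 20 rows as independent observations, which ignores the repeated measurements on each individual.

```r
# Example data from the question (same values, built more compactly)
example <- data.frame(
  Age = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender = rep(c("M", "F", "M", "F"), each = 5),
  Height = c(38, 62, 92, 119, 165,
             31, 59, 87, 118, 170,
             45, 72, 93, 155, 171,
             33, 61, 92, 115, 168)
)

# Height ~ Age * Gender expands to Age + Gender + Age:Gender.
# With "F" as the reference level:
#   (Intercept)  = intercept for F
#   Age          = slope for F
#   GenderM      = intercept for M minus intercept for F
#   Age:GenderM  = slope for M minus slope for F
fit <- lm(Height ~ Age * Gender, data = example)

# Coefficient table: estimate, standard error, t value, p value
coefs <- summary(fit)$coefficients
print(coefs)

# The t-tests on these two rows address the question directly
print(coefs[c("GenderM", "Age:GenderM"), ])
```

      The GenderM row tests the difference in intercepts and the Age:GenderM row tests the difference in slopes; the estimates equal the differences between the two rows of the per-gender table.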
