【Question title】: Linear regression simulation
【Posted】: 2019-09-15 23:33:00
【Question】:

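Simulate the conditions of linear regression and show that the estimates of a multiple linear regression (three or more parameters) are unbiased. Then try to construct a biased estimate of the regression parameters and show by simulation that you managed to achieve bias.

This is what I have tried, but I am stuck on getting an unbiased estimate from a biased one.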

b0 <- 2
b1 <- 1.3
b2 <- 5
N <- 1000
matrica <- matrix(0, nrow = N, ncol = 3)
for (i in 1:N) {
  x1 <- rnorm(100)  # mean and variance are arbitrary
  x2 <- rnorm(100)
  err <- rnorm(100)
  y <- b0 + b1 * x1 + b2 * x2 + err
  fit <- lm(y ~ x1 + x2)
  matrica[i, ] <- fit$coefficients  # intercept, x1 slope, x2 slope
}
matrica

rez1 <- matrica[, 1]
rez2 <- matrica[, 2]
rez3 <- matrica[, 3]

## now we need to show that the estimates are unbiased (b0 ~ mean(rez1), ...)
summary(rez1)
summary(rez2)
summary(rez3)

cor(rez1, rez2)  # highly connected
cor(rez2, rez3)  # highly connected

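【Comments】:

  • I suggest setting a seed for reproducibility. What is your question? Both the question and the desired result are unclear.
  • @OTStats I don't know how to get an unbiased estimate from a biased estimate
  • Add that (in the form of a question) to your post above.

Tags: r statistics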


【Solution 1】:

OK, along the same lines as what you started with, you can do the following:

# True Values
b0=2
b1=1.3
b2=5

# Simulation Set Points
N=1000
n <- 100
set.seed(42)

collector <- matrix(ncol = 3,nrow = N)
colnames(collector) <- c("b0_hat", "b1_hat", "b2_hat")
for(i in 1:N){

  # Generate Data
  x1 <- rnorm(n, mean = 1, sd = 1)
  x2 <- rnorm(n, mean = 1, sd = 1)
  y_hat <- b1 * x1 + b2 * x2 + b0

  # Add Noise
  y <- rnorm(n, y_hat, 1)

  # Fit Data
  fit <- lm(y ~ x1 + x2)

  # Store Results
  collector[i, ] <- fit$coefficients
}

Then, to display the results, you can plot histograms and show that the mean of the estimates is close to the true parameter values beta.


# Graph to Show UnbiasedNess
par(mfrow = c(3,1))
hist(collector[,1], main =expression(hat(beta[0])),breaks = 30)
abline(v =b0, col = "red", lwd = 2)

hist(collector[,2], main =expression(hat(beta[1])),breaks = 30)
abline(v =b1, col = "red", lwd = 2)

hist(collector[,3], main =expression(hat(beta[2])),breaks = 30)
abline(v =b2, col = "red", lwd = 2)
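
The histograms show the shape of the sampling distributions, but a numeric comparison can make "close to the true value" more concrete. The following is a minimal sketch, not part of the original answer: it reruns the same simulation and compares the average of each coefficient with its true value, using the Monte Carlo standard error of that average as a yardstick. The names `beta_true`, `estimates`, `mc_mean`, `mc_se`, and `unbiased_check` are illustrative assumptions.

```r
# Sketch: numeric check of unbiasedness, same setup as the simulation above.
set.seed(42)
beta_true <- c(b0 = 2, b1 = 1.3, b2 = 5)  # true parameter values
N <- 1000  # number of simulated data sets
n <- 100   # observations per data set

# Each replicate fits one model and returns its three coefficients.
estimates <- t(replicate(N, {
  x1 <- rnorm(n, mean = 1, sd = 1)
  x2 <- rnorm(n, mean = 1, sd = 1)
  y_mean <- beta_true[["b0"]] + beta_true[["b1"]] * x1 + beta_true[["b2"]] * x2
  y <- rnorm(n, mean = y_mean, sd = 1)  # normal noise around the true mean
  coef(lm(y ~ x1 + x2))
}))
colnames(estimates) <- names(beta_true)

mc_mean <- colMeans(estimates)               # average estimate per coefficient
mc_se   <- apply(estimates, 2, sd) / sqrt(N) # Monte Carlo standard error of that average

unbiased_check <- rbind(
  truth   = beta_true,
  mc_mean = mc_mean,
  mc_se   = mc_se,
  z       = (mc_mean - beta_true) / mc_se    # distance from the truth in standard errors
)
print(round(unbiased_check, 4))
```

If the estimator is unbiased, the `z` row should typically stay within a few units of zero; a `z` far outside that range would point to bias rather than simulation noise.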

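A biased estimator is one whose estimates differ from the true parameter value in the long run, i.e. its expected value is not the true value. One way to get this is to draw the response from a t-distribution instead of a normal distribution. Note that in the code below the bias comes from using the noncentral t-distribution (`ncp = y_hat`), whose mean is not equal to its noncentrality parameter; simply adding zero-mean t-distributed errors to `y_hat` would leave the estimates unbiased.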
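
To make "differs in expectation" concrete, here is a short derivation sketch (not from the original answer) of what the least-squares estimator targets when the response is drawn with `rt(n, df = 3, ncp = y_hat)`. The symbols $X$ (design matrix with an intercept column), $\nu$ (degrees of freedom), and $c_\nu$ are notation introduced here for illustration.

```latex
% Least-squares estimator and its conditional expectation
\hat\beta = (X^\top X)^{-1} X^\top y
\quad\Longrightarrow\quad
\mathbb{E}[\hat\beta \mid X] = (X^\top X)^{-1} X^\top \, \mathbb{E}[y \mid X]

% Normal noise around X\beta: the estimator is unbiased
\mathbb{E}[y \mid X] = X\beta
\quad\Longrightarrow\quad
\mathbb{E}[\hat\beta \mid X] = \beta

% Mean of a noncentral t with \nu > 1 degrees of freedom and noncentrality \mu
\mathbb{E}[T] = c_\nu \, \mu,
\qquad
c_\nu = \sqrt{\tfrac{\nu}{2}} \; \frac{\Gamma\!\left(\tfrac{\nu - 1}{2}\right)}{\Gamma\!\left(\tfrac{\nu}{2}\right)}

% Noncentral t response with \mu_i = x_i^\top \beta: every coefficient is scaled by c_\nu
\mathbb{E}[y \mid X] = c_\nu X\beta
\quad\Longrightarrow\quad
\mathbb{E}[\hat\beta \mid X] = c_\nu \, \beta,
\qquad
c_3 = \sqrt{6/\pi} \approx 1.382
```

Under these assumptions the estimates should center near $1.382 \times (2,\ 1.3,\ 5) \approx (2.76,\ 1.80,\ 6.91)$ rather than at the true values.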

# True Values
b0=2
b1=1.3
b2=5

# Simulation Set Points
N=1000
n <- 100
set.seed(42)

collector <- matrix(ncol = 3,nrow = N)
colnames(collector) <- c("b0_hat", "b1_hat", "b2_hat")
for(i in 1:N){

  # Generate Data
  x1 <- rnorm(n, mean = 1, sd = 1)
  x2 <- rnorm(n, mean = 1, sd = 1)
  y_hat <- b1 * x1 + b2 * x2 + b0

  # Add Noise from a t-distribution using `rt`
  y <- rt(n, df = 3, ncp = y_hat)

  # Fit Data
  fit <- lm(y ~ x1 + x2)

  # Store Results
  collector[i, ] <- fit$coefficients
}

If you now run the code below, you can see that the estimates are biased.

# Graph to Show Bias
par(mfrow = c(3,1))
hist(collector[,1], main =expression(hat(beta[0])),breaks = 30)
abline(v =b0, col = "red", lwd = 2)

hist(collector[,2], main =expression(hat(beta[1])),breaks = 30)
abline(v =b1, col = "red", lwd = 2)

hist(collector[,3], main =expression(hat(beta[2])),breaks = 30)
abline(v =b2, col = "red", lwd = 2)
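
As a numeric companion to these plots, the sketch below (an illustration, not part of the original answer) compares two ways of bringing in a t-distribution. It assumes the standard result that a noncentral t variable with `nu > 1` degrees of freedom has mean `ncp * sqrt(nu / 2) * gamma((nu - 1) / 2) / gamma(nu / 2)`, and it writes the noncentral t draw out by its definition (a normal with mean `ncp` divided by the square root of a scaled chi-square) instead of calling `rt` with a vector `ncp`. The helper `simulate_means` and all variable names are hypothetical.

```r
# Sketch: where the bias comes from, and a contrast with zero-mean t errors.
set.seed(42)
beta_true <- c(b0 = 2, b1 = 1.3, b2 = 5)
N  <- 1000  # number of simulated data sets
n  <- 100   # observations per data set
nu <- 3     # degrees of freedom of the t-distribution

# Ratio of the noncentral t mean to its ncp (valid for nu > 1).
mean_factor <- sqrt(nu / 2) * gamma((nu - 1) / 2) / gamma(nu / 2)

# Hypothetical helper: average OLS coefficients over N simulated data sets,
# where draw_y() turns the true mean y_hat into a noisy response.
simulate_means <- function(draw_y) {
  est <- replicate(N, {
    x1 <- rnorm(n, mean = 1, sd = 1)
    x2 <- rnorm(n, mean = 1, sd = 1)
    y_hat <- beta_true[["b0"]] + beta_true[["b1"]] * x1 + beta_true[["b2"]] * x2
    y <- draw_y(y_hat)
    coef(lm(y ~ x1 + x2))
  })
  setNames(rowMeans(est), names(beta_true))
}

# (a) Noncentral t with ncp = y_hat: normal(ncp, 1) / sqrt(chisq(nu) / nu).
noncentral <- simulate_means(function(y_hat) {
  rnorm(length(y_hat), mean = y_hat, sd = 1) /
    sqrt(rchisq(length(y_hat), df = nu) / nu)
})

# (b) Zero-mean t errors added to y_hat.
centered <- simulate_means(function(y_hat) y_hat + rt(length(y_hat), df = nu))

bias_check <- rbind(
  truth             = beta_true,
  noncentral_t      = noncentral,
  truth_times_c     = mean_factor * beta_true,  # what (a) is expected to center on
  centered_t_errors = centered
)
print(round(bias_check, 3))
```

Under these assumptions, the `noncentral_t` row should sit near `truth_times_c` instead of `truth`, while the `centered_t_errors` row should stay close to `truth`. That suggests the bias is driven by the shifted mean of the noncentral t, not by heavy tails alone.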

