7.3 The Sampling Distribution of the Sample Mean

Population: 1000; the values are normally distributed with mean 100 and standard deviation 16.

Sample size: 4. The distribution of the sample mean is shown below:

[Figure: histogram of the sample means for n = 4]

This agrees with the mean and standard deviation given by the formulas: μ_x̄ = μ = 100 and σ_x̄ = σ/√n = 16/√4 = 8.

As the figure shows, the histogram is shaped roughly like a normal curve (with parameters 100 and 8).
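The formulas can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses Python's standard library, a fixed seed chosen arbitrarily, and assumes 1000 repeated samples of size 4. The names `sample_means`, `mean_of_means`, and `sd_of_means` are made up for this example.

```python
import random
import statistics

MU, SIGMA = 100, 16      # population parameters from the text
N = 4                    # sample size from the text
NUM_SAMPLES = 1000       # assumed number of repeated samples

rng = random.Random(42)  # fixed seed (arbitrary choice) for reproducibility

# Draw NUM_SAMPLES samples of size N and record the mean of each sample.
sample_means = [
    statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = SIGMA / N ** 0.5   # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.2f} (theory: {MU})")
print(f"sd of sample means:   {sd_of_means:.2f} (theory: {theoretical_sd:.2f})")
```

The simulated values should land close to 100 and 8, though they will not match exactly because of sampling variation.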

Therefore, when the variable is normally distributed with mean μ and standard deviation σ, the sample mean for samples of size n is also normally distributed, with mean μ and standard deviation σ/√n.

This leads to the central limit theorem: for a relatively large sample size, the sample mean is approximately normally distributed even when the variable itself is not normally distributed.

How large is large enough depends on the distribution of the variable; usually, however, a sample size of 30 or more (n ≥ 30) is large enough.
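One rough way to see the theorem at work is to compare the skewness of a clearly non-normal variable with the skewness of its sample means. The sketch below assumes an exponential population (a choice made only for illustration, not taken from the text), a sample size of 30, and a hypothetical helper `skewness`; a skewness near 0 indicates a roughly symmetric shape.

```python
import random
import statistics

def skewness(values):
    """Mean of cubed z-scores; about 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(7)   # arbitrary fixed seed
N = 30                   # sample size
NUM_SAMPLES = 2000       # assumed number of repeated samples

# Individual observations from a right-skewed (exponential) population.
raw_values = [rng.expovariate(1.0) for _ in range(NUM_SAMPLES)]

# Means of samples of size N from the same population.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

skew_raw = skewness(raw_values)
skew_means = skewness(sample_means)

print(f"skewness of raw values:   {skew_raw:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

The sample means should show much less skew than the raw values, which is what an approximately normal shape would suggest.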

Example:

Consider the number of people per household. This variable has a right-skewed distribution:

[Figure: distribution of household size]

Household size is far from being normally distributed; it is right skewed. Nonetheless, according to the central limit theorem, the sampling distribution of the sample mean is approximately normal for a sample size of 30.

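The mean and variance of the sample mean can then be computed:

[Figure: histogram of 1000 sample means]

Draw 1000 samples of 30 households each and compute the mean of each sample; the figure above is the histogram of these 1000 sample means. It is consistent with the central limit theorem: the distribution of the sample mean is approximately normal.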
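The household example can be imitated with a simulation. The household-size probabilities below are hypothetical values picked to give a right-skewed shape; they are not the data behind the original figure. The sketch draws 1000 samples of size 30 and compares the mean and standard deviation of the sample means with μ and σ/√n.

```python
import random
import statistics

# Hypothetical right-skewed household-size distribution (NOT from the source).
sizes = [1, 2, 3, 4, 5, 6, 7]
probs = [0.27, 0.34, 0.16, 0.14, 0.06, 0.02, 0.01]

# Population mean and standard deviation computed from the distribution.
pop_mean = sum(x * p for x, p in zip(sizes, probs))
pop_sd = sum((x - pop_mean) ** 2 * p for x, p in zip(sizes, probs)) ** 0.5

N = 30                     # sample size from the text
NUM_SAMPLES = 1000         # number of samples from the text
rng = random.Random(2024)  # arbitrary fixed seed

# Each sample is N households drawn with replacement; keep each sample's mean.
sample_means = [
    statistics.fmean(rng.choices(sizes, weights=probs, k=N))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = pop_sd / N ** 0.5   # sigma / sqrt(n)

print(f"population mean {pop_mean:.3f}, mean of sample means {mean_of_means:.3f}")
print(f"sigma/sqrt(n) {theoretical_se:.3f}, sd of sample means {sd_of_means:.3f}")
```

Plotting `sample_means` as a histogram should give a roughly bell-shaped picture even though the household sizes themselves are skewed.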

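Distribution of the variable versus the distribution of the sample mean as n increases:

[Figure: distribution of the variable and of the sample mean for increasing n]

As the figure shows, the standard error (SE) also shrinks as n grows.

So the larger the sample, the more tightly the sample means cluster around the population mean, and the smaller the SE.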
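The shrinking SE can be illustrated numerically. The sketch below reuses the normal population with mean 100 and standard deviation 16 from the first example; the sample sizes, the number of repetitions, and the helper name `empirical_se` are assumptions made for this illustration.

```python
import random
import statistics

MU, SIGMA = 100, 16      # population parameters from the first example
NUM_SAMPLES = 2000       # assumed number of repeated samples per n
rng = random.Random(1)   # arbitrary fixed seed

def empirical_se(n):
    """Standard deviation of NUM_SAMPLES simulated sample means of size n."""
    means = [
        statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(NUM_SAMPLES)
    ]
    return statistics.stdev(means)

# Map each sample size to (simulated SE, theoretical SE = sigma / sqrt(n)).
results = {n: (empirical_se(n), SIGMA / n ** 0.5) for n in (1, 4, 16, 64)}

for n, (simulated, theoretical) in results.items():
    print(f"n={n:3d}  simulated SE={simulated:6.2f}  sigma/sqrt(n)={theoretical:6.2f}")
```

Because SE = σ/√n, quadrupling the sample size halves the SE, which is why the sizes here go up in steps of four.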

 
