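[Posted]: 2020-12-11 13:19:25
[Question]:
I am trying to write a skewness calculation from scratch, but I cannot match the values returned by the pandas method or the scipy.stats function.
I have gone through the scipy.stats source code, but I cannot find what I am missing.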
from math import sqrt
from scipy import stats
import pandas as pd
def mean(values):
    return sum(values) / len(values)

def standard_dev(values):
    # Population standard deviation (divides by n, i.e. ddof=0)
    vals_mean = mean(values)
    numerator = 0
    for val in values:
        numerator += (val - vals_mean) ** 2
    return sqrt(numerator / len(values))

def skewness(values):
    n = len(values)
    vals_mean = mean(values)
    third_moment = 0
    for val in values:
        third_moment += (val - vals_mean) ** 3
    return (sqrt(n * (n - 1)) / (n - 2)) * (third_moment / standard_dev(values) ** 3)

values = [1,1,1,2,2,3,3,3,4,4,5]
print(f'mean{mean(values)}')
# mean2.6363636363636362
print(f'standard_dev{standard_dev(values)}')
# standard_dev1.2984415324623364
print(f'skewness{skewness(values)}')
# skewness2.5341000098031734
a = pd.Series(values)
a.std(ddof=0)
# 1.2984415324623364
a.skew()
# 0.23037272816392504
stats.skew(a, bias=False)
# 0.230372728163925
stats.skew(a, bias=True)
# 0.19768660009807223
[Discussion]:
Tags: python pandas scipy statistics
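A likely cause, judging only from the code in the question: the third moment in `skewness` is accumulated as a raw sum of cubed deviations and never divided by n, so the result comes out n times too large (2.5341 / 11 ≈ 0.2304, which is the pandas value). The sketch below divides by n before applying the sqrt(n(n-1)) / (n-2) correction. The helper name `central_moment` and the `bias` flag are assumptions made for illustration (the flag only mimics the `bias` argument of `stats.skew`); this is not taken from the scipy or pandas source.

```python
from math import sqrt


def central_moment(values, order):
    """Population central moment: the MEAN (not the sum) of (x - mean) ** order."""
    n = len(values)
    vals_mean = sum(values) / n
    return sum((val - vals_mean) ** order for val in values) / n


def skewness(values, bias=True):
    """Sample skewness; bias=False applies the sqrt(n(n-1)) / (n-2) correction.

    The `bias` flag is a hypothetical parameter modelled on stats.skew.
    """
    n = len(values)
    m2 = central_moment(values, 2)  # population variance
    m3 = central_moment(values, 3)  # third central moment, divided by n
    g1 = m3 / m2 ** 1.5             # biased (population) skewness
    if bias:
        return g1
    return sqrt(n * (n - 1)) / (n - 2) * g1


values = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5]
print(skewness(values, bias=True))   # approx. 0.1977, as stats.skew(a, bias=True) in the question
print(skewness(values, bias=False))  # approx. 0.2304, as a.skew() in the question
```

With this change the only difference from the original function is the division of the third moment by n, so `standard_dev` (with ddof=0) can stay as it is.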