【Question Title】: Not able to reproduce the result of skewness measure in python pandas/scipy
【Posted】: 2020-12-11 13:19:25
【Question】:

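I am trying to write a skewness calculation from scratch, but I cannot match the values returned by the pandas `skew()` method or the `scipy.stats.skew` function.

I have read through the scipy.stats source code, but I cannot find what I am missing.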

from math import sqrt
from scipy import stats
import pandas as pd
  
def mean(values):
    return sum(values) / len(values)

def standard_dev(values): 
    vals_mean = mean(values)
    numerator = 0
    for val in values:
        numerator += (val - vals_mean) ** 2 
    return sqrt(numerator/len(values))

def skewness(values): 
    n = len(values)
    vals_mean = mean(values)
    third_moment = 0
    for val in values: 
        third_moment += (val - vals_mean)**3
    return (sqrt(n*(n-1))/ (n-2)) * (third_moment / standard_dev(values) ** 3)
  
values = [1,1,1,2,2,3,3,3,4,4,5] 
  
print(f'mean{mean(values)}')
# mean2.6363636363636362

print(f'standard_dev{standard_dev(values)}')
# standard_dev1.2984415324623364

print(f'skewness{skewness(values)}') 
# skewness2.5341000098031734

a = pd.Series(values)
a.std(ddof=0)
# 1.2984415324623364
a.skew()
# 0.23037272816392504

stats.skew(a, bias=False)
# 0.230372728163925
stats.skew(a, bias=True)
# 0.19768660009807223

【Comments】:

    Tags: python pandas scipy statistics


    【Solution 1】:

    I found the mistake. I was not normalizing the third moment, i.e. I was not dividing it by len(values).

    Here is the full version:
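For reference, the quantity being computed can be written out as follows. This is a sketch of the standard adjusted Fisher-Pearson skewness, which is the form that `stats.skew(a, bias=False)` and `a.skew()` agree on in the question; the symbols $m_k$, $g_1$ and $G_1$ are notation introduced here, not taken from the original post.

```latex
m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k, \qquad
g_1 = \frac{m_3}{m_2^{3/2}}, \qquad
G_1 = \frac{\sqrt{n(n-1)}}{n-2}\, g_1
```

The original attempt used the raw sum $\sum_i (x_i - \bar{x})^3$ in place of $m_3$, so its result was too large by a factor of $n$: with $n = 11$, $11 \times 0.2303727\ldots \approx 2.5341$, which is exactly the value it printed.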

    from math import sqrt 
      
    def mean(values):
        return sum(values) / len(values)
    
    def moments(values, moment):
        vals_mean = mean(values)
        numerator = 0
        for val in values:
            numerator += (val - vals_mean) ** moment 
        return numerator/len(values)
    
    def skewness(values): 
        n = len(values)
        m2 = moments(values, 2)
        m3 = moments(values, 3)
        return (sqrt(n*(n-1))/ (n-2)) * (m3 / m2 ** 1.5)
      
    values = [1,1,1,2,2, 3,3,3, 4,4, 5] 
      
    print(f'mean: {mean(values)}')
    print(f'skewness: {skewness(values)}') 
    # mean: 2.6363636363636362
    # skewness: 0.23037272816392482
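As a cross-check, the sketch below separates the biased estimate (what `stats.skew(a, bias=True)` reports in the question) from the adjusted one (what `a.skew()` and `stats.skew(a, bias=False)` report), using only the standard library. The function names `central_moment`, `skew_biased` and `skew_adjusted` are illustrative and do not come from pandas or scipy; the reference numbers are the ones quoted in the question.

```python
from math import sqrt, isclose

def central_moment(values, k):
    """k-th central moment, normalized by n (population form)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n

def skew_biased(values):
    """Biased sample skewness: m3 / m2**1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def skew_adjusted(values):
    """Biased skewness scaled by sqrt(n*(n-1)) / (n-2); needs n > 2."""
    n = len(values)
    return sqrt(n * (n - 1)) / (n - 2) * skew_biased(values)

values = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5]
g1 = skew_biased(values)
G1 = skew_adjusted(values)

# Compare against the outputs quoted in the question.
assert isclose(g1, 0.19768660009807223, rel_tol=1e-9)  # stats.skew(a, bias=True)
assert isclose(G1, 0.230372728163925, rel_tol=1e-9)    # a.skew(), bias=False

# Skipping the division by n in the third moment inflates the result n-fold,
# which reproduces the mismatched value from the question.
assert isclose(len(values) * G1, 2.5341000098031734, rel_tol=1e-9)

print(f'biased: {g1}, adjusted: {G1}')
```

Keeping the biased and adjusted forms as separate functions makes it easier to see which library setting a hand-rolled value should be compared against.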
    

    【Comments】:
