【Question title】: Python - unexpected shape parameter behavior in scipy genextreme fit
【Posted】: 2019-02-26 06:32:44
【Question】:

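I have been trying to use Scipy's stats.genextreme function to fit a GEV distribution to some annual maximum river flows, but I have found some odd fitting behavior. Depending on how small the data are (i.e. on the order of 1e-5 vs 1e-1), the returned shape parameter can be dramatically different. For example: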

import scipy
import numpy as np
from scipy.stats import genextreme as gev
from scipy.stats import gumbel_r as gumbel

#Set up arrays of values to fit curve to 
sample=np.random.rand(1,30) #Random set of decimal values 
smallVals = sample*1e-5     #Scale to smaller values 

#If the above is not creating different values, this instance of random numbers has:
bugArr = np.array([[0.25322987, 0.81952358, 0.94497455, 0.36295543, 0.72272746, 0.49482558,0.65674877, 0.40876558, 0.64952248, 0.23171052, 0.24645658, 0.35359126,0.27578928, 0.24820775, 0.69789187, 0.98876361, 0.22104156,0.40019593,0.0756707,  0.12342556, 0.3601186,  0.54137089,0.43477705, 0.44622486,0.75483338, 0.69766687, 0.1508741,  0.75428996, 0.93706003, 0.1191987]])
bugArr_small = bugArr*1e-5

#This array of random numbers gives the same shape parameter regardless 
fineArr = np.array([[0.7449611,  0.82376693, 0.32601009, 0.18544293, 0.56779629, 0.30495415,
        0.04670362, 0.88106521, 0.34013959, 0.84598841, 0.24454428, 0.57981437,
        0.57129427, 0.8857514,  0.96254429, 0.64174078, 0.33048637, 0.17124045,
        0.11512589, 0.31884749, 0.48975204, 0.87988863, 0.86898236, 0.83513966,
        0.05858769, 0.25889509, 0.13591874, 0.89106616, 0.66471263, 0.69786708]])
fineArr_small = fineArr*1e-5

#GEV fit for both arrays - shouldn't dramatically change distribution 
gev_fit      = gev.fit(sample)
gevSmall_fit = gev.fit(smallVals)

gevBug      = gev.fit(bugArr)
gevSmallBug = gev.fit(bugArr_small)

gevFine      = gev.fit(fineArr)
gevSmallFine = gev.fit(fineArr_small)

For bugArr/bugArr_small and fineArr/fineArr_small, I get the following estimated GEV parameters:

Known bug array
Random values:         (0.12118250540401079, 0.36692231766996053, 0.23142400358716353)
Random values scaled:  (-0.8446554391074808, 3.0751769299431084e-06, 2.620390405092363e-06)

Known fine array
Random values:         (0.6745399522587823, 0.47616297212022757, 0.34117425062278584)
Random values scaled:  (0.6745399522587823, 4.761629721202293e-06, 3.411742506227867e-06)

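Why does the shape parameter change so dramatically when the only difference in the data is a change in scale? I would expect the behavior to match the fineArr results (no change in the shape parameter, with the location and scale parameters scaled accordingly). I repeated the test in Matlab, and the results there were as I expected (i.e. no change in the shape parameter).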
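
As a sanity check on that expectation, here is a minimal sketch showing that rescaling the data together with the location and scale parameters only shifts the negative log-likelihood by the constant n*log(k). The seeded sample and the parameter values are arbitrary assumptions chosen for illustration, not values fitted to the arrays above.

```python
import numpy as np
from scipy.stats import genextreme as gev

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
data = rng.random(30)            # stand-in sample in [0, 1)
k = 1e-5                         # same rescaling factor as in the question

# Arbitrary illustrative parameters (assumed, not fitted).
c, loc, scale = 0.1, 0.4, 0.25

# Negative log-likelihood on the original data and on the rescaled data,
# with loc and scale rescaled by the same factor and the shape left alone.
nll = -gev.logpdf(data, c, loc=loc, scale=scale).sum()
nll_small = -gev.logpdf(data * k, c, loc=loc * k, scale=scale * k).sum()

# The difference should equal n*log(k), whatever (c, loc, scale) are.
print(nll_small - nll, data.size * np.log(k))
```

Since the offset does not depend on the parameters, the maximum-likelihood shape should in principle be the same for both data sets, which suggests the discrepancy comes from the numerical optimization rather than from the model itself.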

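【Comments】:

  • What is this shape parameter, and where is it? Can you guide us?
  • @Bazingaa, gev.fit returns a tuple of length 3 containing the shape, location, and scale parameters (in that order).

Tags: python scipy statistics curve-fitting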


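【Answer 1】:

I think I know why this might be happening. It is possible to pass initial shape parameter estimates when fitting; see the documentation for scipy.stats.rv_continuous.fit, which describes these arguments as "Starting value(s) for any shape-characterizing arguments (those not provided will be determined by a call to _fitstart(data)). No default value."

Here is some extremely ugly, but functional, code using my pyeq3 statistical distribution fitter, which internally tries different initial estimates, fits each of them, and returns the parameters of the fit with the best nnlf (negative log-likelihood). This example code does not show the behavior you observe, and gives the same shape parameter regardless of scaling. You will need to install pyeq3 with "pip3 install pyeq3" to run this code. The pyeq3 code is designed for text input from the web interface on zunzun.com, so hold your nose. Here is the example code: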

import numpy as np

#Set up arrays of values to fit curve to 
sample=np.random.rand(1,30) #Random set of decimal values 
smallVals = sample*1e-5     #Scale to smaller values 

#If the above is not creating different values, this instance of random numbers has:
bugArr = np.array([0.25322987, 0.81952358, 0.94497455, 0.36295543, 0.72272746, 0.49482558,0.65674877, 0.40876558, 0.64952248, 0.23171052, 0.24645658, 0.35359126,0.27578928, 0.24820775, 0.69789187, 0.98876361, 0.22104156,0.40019593,0.0756707,  0.12342556, 0.3601186,  0.54137089,0.43477705, 0.44622486,0.75483338, 0.69766687, 0.1508741,  0.75428996, 0.93706003, 0.1191987])
bugArr_small = bugArr*1e-5

#This array of random numbers gives the same shape parameter regardless 
fineArr = np.array([0.7449611,  0.82376693, 0.32601009, 0.18544293, 0.56779629, 0.30495415,
        0.04670362, 0.88106521, 0.34013959, 0.84598841, 0.24454428, 0.57981437,
        0.57129427, 0.8857514,  0.96254429, 0.64174078, 0.33048637, 0.17124045,
        0.11512589, 0.31884749, 0.48975204, 0.87988863, 0.86898236, 0.83513966,
        0.05858769, 0.25889509, 0.13591874, 0.89106616, 0.66471263, 0.69786708])
fineArr_small = fineArr*1e-5

bugArr_str = ''
for i in range(len(bugArr)):
    bugArr_str += str(bugArr[i]) + '\n'
bugArr_small_str = ''
for i in range(len(bugArr_small)):
    bugArr_small_str += str(bugArr_small[i]) + '\n'
fineArr_str = ''
for i in range(len(fineArr)):
    fineArr_str += str(fineArr[i]) + '\n'
fineArr_small_str = ''
for i in range(len(fineArr_small)):
    fineArr_small_str += str(fineArr_small[i]) + '\n'
import pyeq3

simpleObject_bugArr = pyeq3.IModel.IModel()
simpleObject_bugArr._dimensionality = 1
pyeq3.dataConvertorService().ConvertAndSortColumnarASCII(bugArr_str, simpleObject_bugArr, False)
solver = pyeq3.solverService()
result_bugArr = solver.SolveStatisticalDistribution('genextreme', simpleObject_bugArr.dataCache.allDataCacheDictionary['IndependentData'][0], 'nnlf')
simpleObject_bugArr_small = pyeq3.IModel.IModel()
simpleObject_bugArr_small._dimensionality = 1
pyeq3.dataConvertorService().ConvertAndSortColumnarASCII(bugArr_small_str, simpleObject_bugArr_small, False)
solver = pyeq3.solverService()
result_bugArr_small = solver.SolveStatisticalDistribution('genextreme', simpleObject_bugArr_small.dataCache.allDataCacheDictionary['IndependentData'][0], 'nnlf')

simpleObject_fineArr = pyeq3.IModel.IModel()
simpleObject_fineArr._dimensionality = 1
pyeq3.dataConvertorService().ConvertAndSortColumnarASCII(fineArr_str, simpleObject_fineArr, False)
solver = pyeq3.solverService()
result_fineArr = solver.SolveStatisticalDistribution('genextreme', simpleObject_fineArr.dataCache.allDataCacheDictionary['IndependentData'][0], 'nnlf')

simpleObject_fineArr_small = pyeq3.IModel.IModel()
simpleObject_fineArr_small._dimensionality = 1
pyeq3.dataConvertorService().ConvertAndSortColumnarASCII(fineArr_small_str, simpleObject_fineArr_small, False)
solver = pyeq3.solverService()
result_fineArr_small = solver.SolveStatisticalDistribution('genextreme', simpleObject_fineArr_small.dataCache.allDataCacheDictionary['IndependentData'][0], 'nnlf')

print('ba',result_bugArr[1]['fittedParameters'])
print('ba_s',result_bugArr_small[1]['fittedParameters'])
print()
print('fa',result_fineArr[1]['fittedParameters'])
print('fa_s',result_fineArr_small[1]['fittedParameters'])
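
For readers who would rather not install pyeq3, the same idea (try several starting shapes and keep the fit with the lowest negative log-likelihood) can be sketched with scipy alone. This is not how pyeq3 is implemented. The helper name fit_gev_multistart, the list of starting shapes, and the extra step of dividing the data by its standard deviation before fitting are all assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import genextreme as gev

def fit_gev_multistart(data, shape_guesses=(-0.5, -0.1, 0.1, 0.5)):
    """Hypothetical helper: multi-start GEV fit on data rescaled to unit spread."""
    data = np.asarray(data, dtype=float).ravel()
    factor = data.std()              # assumes the data are not all identical
    scaled = data / factor           # fit in units where the spread is about 1

    best = None
    for c0 in shape_guesses:
        try:
            # A positional argument after the data is the starting shape value.
            c, loc, scale = gev.fit(scaled, c0)
        except RuntimeError:
            continue                 # skip starts where the fit itself failed
        nll = -gev.logpdf(scaled, c, loc=loc, scale=scale).sum()
        if np.isfinite(nll) and (best is None or nll < best[0]):
            best = (nll, c, loc, scale)

    if best is None:
        raise RuntimeError("no starting shape produced a valid fit")

    _, c, loc, scale = best
    # Shape is unit-free; loc and scale go back to the original units.
    return c, loc * factor, scale * factor

# Stand-in sample; any of the arrays above can be substituted.
sample = np.random.default_rng(0).random(30)
print(fit_gev_multistart(sample))
print(fit_gev_multistart(sample * 1e-5))
```

Normalizing first means the optimizer sees essentially the same problem regardless of the original units, so any remaining sensitivity should come only from the choice of starting shapes.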

【Comments】:
