【发布时间】:2018-04-12 21:41:15
【问题描述】:
我正在尝试对某些数据执行 Mann-Kendall 检验。我正在使用以下链接 (https://github.com/mps9506/Mann-Kendall-Trend/blob/master/mk_test.py) 中的代码进行了一些修改,以便将结果放在一个数组中,它只返回一个 p 值 (p) 和一个 tau 值 (z)。
def mk_test(x, alpha=0.05):
"""
This function is derived from code originally posted by Sat Kumar Tomer
(satkumartomer@gmail.com)
See also: http://vsp.pnnl.gov/help/Vsample/Design_Trend_Mann_Kendall.htm
The purpose of the Mann-Kendall (MK) test (Mann 1945, Kendall 1975, Gilbert
1987) is to statistically assess if there is a monotonic upward or downward
trend of the variable of interest over time. A monotonic upward (downward)
trend means that the variable consistently increases (decreases) through
time, but the trend may or may not be linear. The MK test can be used in
place of a parametric linear regression analysis, which can be used to test
if the slope of the estimated linear regression line is different from
zero. The regression analysis requires that the residuals from the fitted
regression line be normally distributed; an assumption not required by the
MK test, that is, the MK test is a non-parametric (distribution-free) test.
Hirsch, Slack and Smith (1982, page 107) indicate that the MK test is best
viewed as an exploratory analysis and is most appropriately used to
identify stations where changes are significant or of large magnitude and
to quantify these findings.
Input:
x: a vector of data
alpha: significance level (0.05 default)
Output:
trend: tells the trend (increasing, decreasing or no trend)
h: True (if trend is present) or False (if trend is absence)
p: p value of the significance test
z: normalized test statistics`
Examples
--------
>>> x = np.random.rand(100)
>>> trend,h,p,z = mk_test(x,0.05)
"""
n = len(x)
# calculate S
s = 0
for k in range(n-1):
for j in range(k+1, n):
s += np.sign(x[j] - x[k])
# calculate the unique data
unique_x = np.unique(x)
g = len(unique_x)
# calculate the var(s)
if n == g: # there is no tie
var_s = (n*(n-1)*(2*n+5))/18
else: # there are some ties in data
tp = np.zeros(unique_x.shape)
for i in range(len(unique_x)):
tp[i] = sum(x == unique_x[i])
var_s = (n*(n-1)*(2*n+5) - np.sum(tp*(tp-1)*(2*tp+5)))/18
if s > 0:
z = (s - 1)/np.sqrt(var_s)
#result = (s - 1)/np.sqrt(var_s)
elif s == 0:
z = 0
#result = 0
elif s < 0:
z = (s + 1)/np.sqrt(var_s)
#result = (s + 1)/np.sqrt(var_s)
# calculate the p_value
p = 2*(1-norm.cdf(abs(z))) # two tail test
result= np.append(p,z)
h = abs(z) > norm.ppf(1-alpha/2)
return np.array(result)
然后我使用下面的代码来执行测试。
out = np.empty((0))
for i in range(145):
for j in range(192):
out1 = mk_test(yrmax[:,i,j], alpha=0.05)
out = np.append(out, out1, axis=0)
我认为在执行测试时某处出了点问题,因为我希望得到介于 -1 和 1 之间的 z 值,但我得到了一些大于 1 的值。是编码出错了还是我误解 z 是什么,它实际上不是 tau,因此为什么我得到了我没想到的值?
【问题讨论】:
-
如果这个问题是关于你的统计数据的计算/实施,你可能会在stats.stackexchange.com获得更好的运气
-
我不确定编码是否有问题。我也发一下,谢谢!
标签: python arrays statistics jupyter-notebook definition