Pandas Granger Causality: Answers

【Question title】: Pandas Granger Causality
【Posted】: 2016-06-14 04:24:08
【Question description】:

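I want to run a Granger causality test on time series data using Python Pandas, and I have two questions.

(1) I tried the pandas.stats.var package, but it appears to be deprecated. Are there other recommended options?

(2) I am having trouble interpreting the output of the VAR.granger_causality() function in the pandas.stats.var package. The only reference I could find is this comment in the source code: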

   Returns the f-stats and p-values from the Granger Causality Test.
   If the data consists of columns x1, x2, x3, then we perform the
   following regressions:
   x1 ~ L(x2, x3)
   x1 ~ L(x1, x3)
   x1 ~ L(x1, x2)
   The f-stats of these results are placed in the 'x1' column of the
   returned DataFrame.  We then repeat for x2, x3.
   Returns
   -------
   Dict, where 'f-stat' returns the DataFrame containing the f-stats,
   and 'p-value' returns the DataFrame containing the corresponding
   p-values of the f-stats.

For example, the output of a trial run looks like this:

p-value:
          C         B         A
A   0.472122  0.798261  0.412984
B   0.327602  0.783978  0.494436
C   0.071369  0.385844  0.688292

f-stat:
          C         B         A
A   0.524075  0.065955  0.680298
B   0.975334  0.075878  0.473030
C   3.378231  0.763898  0.162619

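I understand that each cell in the p-value table corresponds to a cell in the f-stat table, but I don't understand what the cells in the f-stat table refer to. For example, what does the value 0.52 in column C, row A mean?

【Comments on the question】:

For pandas, you usually need to check statsmodels and scipy (and sometimes numpy for simpler statistics). statsmodels seems to have something: statsmodels.sourceforge.net/0.6.0/generated/…
Updated link for @JohnE's answer: link
You can check this link for interpreting the results via p-values: machinelearningplus.com/time-series/time-series-analysis-python

Tags: python pandas scipy statistics statsmodels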

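【Solution 1】:

(Null hypothesis) H0: Xt does not Granger-cause Yt.
(Alternative hypothesis) H1: Xt Granger-causes Yt.

If the p-value is less than 5% (0.05), we can reject the null hypothesis (H0) and conclude that Xt Granger-causes Yt.

So wherever the p-value is less than 0.05, you can consider those features.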

【Discussion】:

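【Solution 2】:

Remember that Granger causality in its simplest form consists of an F-test on the R2 of two regressions: y=const+y[-1]+e versus y=const+y[-1]+x[-1]+e

The test checks whether the R2 of the second regression is significantly higher. See also: http://www.statisticshowto.com/granger-causality/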
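
The two regressions above can be sketched by hand with numpy and scipy. This is only an illustrative one-lag version under stated assumptions: the helper name `granger_f_test` and the simulated data are hypothetical, and the F statistic is written in terms of residual sums of squares, which is equivalent to the R2 comparison because both regressions share the same dependent variable.

```python
import numpy as np
from scipy import stats


def granger_f_test(y, x):
    """F-test of y = const + y[-1] + e against y = const + y[-1] + x[-1] + e."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[1:]                      # y at time t
    const = np.ones_like(target)
    restricted = np.column_stack([const, y[:-1]])
    unrestricted = np.column_stack([const, y[:-1], x[:-1]])

    def ssr(design):
        # Ordinary least squares fit, then the residual sum of squares.
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ beta
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    q = 1                               # one restriction: coefficient on x[-1] is zero
    dof = len(target) - unrestricted.shape[1]
    f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / dof)
    p_value = stats.f.sf(f_stat, q, dof)
    return f_stat, p_value


# Simulated data (an assumption for illustration): y depends on lagged x.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(scale=0.5)

f_stat, p_value = granger_f_test(y, x)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```

A small p-value here means that adding x[-1] reduces the residual variance by more than chance alone would explain, which is the sense in which x is said to Granger-cause y.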

【Discussion】: