pandas.concat with lagged inputs

【Question title】: pandas.concat with lagged inputs
【Posted】: 2017-03-13 16:47:19
【Question】:

I am working with time series data in Keras, with Tensorflow as the backend.

I have a problem with the input to my neural network:

X=pd.concat([X_prices,X_os,X_months,X_wd,X_stock],axis=1)

If I run:

print(X_prices.shape,X_os.shape,X_wd.shape,X_months.shape,X_stock.shape)
print(X.shape)

I get:

((729, 10), (729, 1), (729, 6), (729, 11), (729, 10))
((729,38))

Unfortunately, when I append the lagged time series:

X=pd.concat([X_prices,X_os,X_months,X_wd,X_stock,X_lag1],axis=1)
print(X_lag1.shape)
print(X.shape)

I get:

((729,10))
((1458,48))

Basically, my number of rows has doubled.

I don't know what I'm missing.

Thanks for your help.

【Comments】:

What is your desired output?

Tags: python pandas keras

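【Answer 1】:

It's hard to be sure without seeing the data.

But if I had to take a guess, I'd say your problem is the indices of your DataFrames. Let me show you an example of what I mean: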

import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.random.rand(5), index=np.arange(5))
df_2 = pd.DataFrame(np.random.rand(5), index=np.arange(5))
df_3 = pd.DataFrame(np.random.rand(5), index=np.arange(5) + 5)

If we concatenate the first two (same index values):

# Same index: the rows line up, giving 5 rows and 2 columns
pd.concat([df_1, df_2], axis=1)

Now if we concatenate the first and the last (different index values):

# Disjoint indexes: the result has 10 rows, padded with NaN where a frame has no data
pd.concat([df_1, df_3], axis=1)
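
To make the effect concrete, here is a minimal sketch that checks the shapes in both cases. It rebuilds the same toy frames so that it runs on its own. The name `df_3_aligned` is introduced only for illustration, and overwriting the index like this assumes the rows of both frames really are in the same order, which may not hold for real data.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.random.rand(5), index=np.arange(5))
df_2 = pd.DataFrame(np.random.rand(5), index=np.arange(5))
df_3 = pd.DataFrame(np.random.rand(5), index=np.arange(5) + 5)

same = pd.concat([df_1, df_2], axis=1)  # indexes match
diff = pd.concat([df_1, df_3], axis=1)  # indexes do not overlap

print(same.shape)               # (5, 2)
print(diff.shape)               # (10, 2): the row count doubles
print(diff.isna().sum().sum())  # 10 NaN cells used as padding

# Quick diagnostic: do the two frames share the same index?
print(df_1.index.equals(df_3.index))  # False

# One possible fix (assumes both frames are already in the same row order):
# copy the frame and give it the index of the frame it should align with.
df_3_aligned = df_3.copy()
df_3_aligned.index = df_1.index
fixed = pd.concat([df_1, df_3_aligned], axis=1)
print(fixed.shape)              # (5, 2)
```

With `axis=1`, `pd.concat` aligns rows on index labels and by default keeps the union of all labels, which is why disjoint indexes double the row count instead of raising an error.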

Hope that helps!

【Comments】:

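Well, that's a great answer: you got it without even seeing the data! After looking at my data, I found exactly that output with NaN values, so I used the shift function in Python instead of reading the lagged series from a csv (I probably messed up the index there). Thanks!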
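
A minimal sketch of the `shift` approach mentioned in this comment. The frame `X_prices` below is a small stand-in with made-up values and hypothetical column names, not the asker's actual data. The point is only that `DataFrame.shift` keeps the original index, so the concatenation keeps the same number of rows.

```python
import numpy as np
import pandas as pd

# Stand-in for the price frame: 6 rows, 2 hypothetical columns.
X_prices = pd.DataFrame(
    np.arange(12, dtype=float).reshape(6, 2),
    columns=["p1", "p2"],
)

# shift(1) moves every value down one row and leaves the index unchanged;
# add_suffix renames the columns so they do not clash with the originals.
X_lag1 = X_prices.shift(1).add_suffix("_lag1")

X = pd.concat([X_prices, X_lag1], axis=1)
print(X.shape)  # (6, 4): same number of rows as X_prices

# The first row has no previous value, so its lagged columns are NaN.
# Dropping it is one common choice before feeding the data to a model.
X = X.dropna()
print(X.shape)  # (5, 4)
```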
thx, it has happened to me a few times, which is why I had that feeling :P