Finding the difference of values in a series in pandas based on another time series

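【Question title】: Finding the difference of values in a series in pandas based on another time series
【Posted】: 2020-11-11 22:38:47
【Question】:

I have a financial time series in pandas and a second time series, "position", which takes the value 1 when the trend is positive and -1 otherwise. The position series keeps alternating between 1 and -1. Is there a function or a clever way to find the difference between the start and the end of a positive period? More specifically, I want to sum all the increments, but to do that I am struggling to find a way to identify where each trend starts and ends. Thanks.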

【Comments】:

Tags: python pandas dataframe time-series

【Solution 1】:

Suppose you have a DataFrame like this:

         date  value
0  2020-11-12     10
1  2020-11-13     12
2  2020-11-14     15
3  2020-11-15     17
4  2020-11-16     17
5  2020-11-17     11
6  2020-11-18     12
7  2020-11-19      9
8  2020-11-20      7
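To follow along, the frame above can be rebuilt with a few lines. This is only a sketch: storing `date` as real datetimes (rather than strings) is an assumption, since the original post does not show how the frame was created.

```python
import pandas as pd

# Rebuild the example frame shown above.
# Assumption: 'date' holds daily datetimes; plain strings would work too.
df = pd.DataFrame({
    "date": pd.date_range("2020-11-12", periods=9, freq="D"),
    "value": [10, 12, 15, 17, 17, 11, 12, 9, 7],
})
print(df)
```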

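and you want to compute the difference between the first and the last value of each ascending period. The result you are after looks like this: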

          start_date  first_value  last_value  difference
trend_no                                                 
1         2020-11-12           10          17           7
2         2020-11-17           11          12           1

You get it by running the following code:

import numpy as np

# work out the trend of the series
# (the first row has no predecessor, so its difference is the value itself)
df['difference'] = df['value'] - df['value'].shift(1).fillna(0)
df['trend'] = np.sign(df['difference'])

# work out the start and end of each ascending series
# (the comparisons already return booleans, so no cast is needed)
df['start_ascend'] = df['trend'].shift(-1) > df['trend']
df.loc[0, 'start_ascend'] = True
df['end_ascend'] = df['trend'].shift(-1) < df['trend']

# assign a number to the ascending trends
# note that the trends are not yet delimited correctly
df['trend_no'] = df['start_ascend'].cumsum()

# now work out the borders of the ascending trends;
# all records that belong to an ascending trend
# will have df['keep_mask'] == True
df['keep_mask'] = np.nan
indexer = df['end_ascend'].shift(1, fill_value=False)
df.loc[indexer, 'keep_mask'] = 0.0
df.loc[df['start_ascend'], 'keep_mask'] = 1.0
# .ffill() replaces fillna(method='ffill'), which is deprecated in recent pandas
df['keep_mask'] = df['keep_mask'].ffill().astype('bool')

# now do the final aggregation
df_res = df[df['keep_mask']].groupby('trend_no').agg(
    start_date=('date', 'first'),
    first_value=('value', 'first'),
    last_value=('value', 'last'),
)
df_res['difference'] = df_res['last_value'] - df_res['first_value']
df_res

If you want to see what the steps above actually do, look at the intermediate DataFrame:

         date  value  trend  start_ascend  end_ascend  trend_no  keep_mask
0  2020-11-12     10    1.0          True       False         1       True
1  2020-11-13     12    1.0         False       False         1       True
2  2020-11-14     15    1.0         False       False         1       True
3  2020-11-15     17    1.0         False        True         1       True
4  2020-11-16     17    0.0         False        True         1      False
5  2020-11-17     11   -1.0          True       False         2       True
6  2020-11-18     12    1.0         False        True         2       True
7  2020-11-19      9   -1.0         False       False         2      False
8  2020-11-20      7   -1.0         False       False         2      False
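As a cross-check, the per-period differences can also be obtained by summing the positive increments within each run, since the increments of a run telescope to last value minus first value. The sketch below is a different, more compact technique than the code in this answer (row-to-row `diff` plus a cumulative run label), and the names `step`, `rising`, `run_id` and `gains` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2020-11-12", periods=9, freq="D"),
    "value": [10, 12, 15, 17, 17, 11, 12, 9, 7],
})

# Change from the previous row; NaN on the first row.
step = df["value"].diff()

# True on rows where the value went up.
rising = step > 0

# A run starts on a rising row whose predecessor was not rising;
# the cumulative sum of those starts labels the runs 1, 2, ...
run_id = (rising & ~rising.shift(fill_value=False)).cumsum()

# Sum the increments of each ascending run.
gains = step[rising].groupby(run_id[rising]).sum()
print(gains)
```

On the example data this gives 7 for the first run and 1 for the second, matching the `difference` column of the result table.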

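【Comments】:

【Solution 2】:

If you add the previous and the current position value, you can look for the rows where the sum is 0; those are the points where the trend reverses.

Like this: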

trend_flipped = df["Trend"] + df["Trend"].shift() == 0
df[trend_flipped]
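The flip mask can be taken one step further to get the start-to-end difference of each positive period, which is what the question asks for. The following is a minimal sketch under assumptions: the `price` column and the sample numbers are invented for illustration, `Trend` is the +1/-1 position series, and a period's difference is measured from its first row to its last row.

```python
import pandas as pd

# Hypothetical data: 'price' is the financial series, 'Trend' the +1/-1 position.
df = pd.DataFrame({
    "price": [100, 102, 105, 103, 101, 104, 108, 107],
    "Trend": [1, 1, 1, -1, -1, 1, 1, -1],
})

# True on rows where the position differs from the previous row.
trend_flipped = (df["Trend"] + df["Trend"].shift()) == 0

# Every flip opens a new block, so the cumulative sum labels the blocks.
block = trend_flipped.cumsum()

# Keep the positive blocks and compare first and last price in each.
positive = df[df["Trend"] == 1]
per_block = positive.groupby(block[positive.index])["price"].agg(["first", "last"])
per_block["difference"] = per_block["last"] - per_block["first"]

# Sum of all increments over the positive periods.
total = per_block["difference"].sum()
print(per_block)
print(total)
```

If the difference should instead run up to the first row after the flip (the exit price), the block boundaries would need to be shifted by one row; which convention applies depends on how the position series is defined.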

【Comments】: