Pandas DataFrame - grab value x rows below current row and compare

【Question title】: Pandas DataFrame - grab value x rows below current row and compare
【Posted】: 2017-10-28 23:25:49
【Question】:

I have a DataFrame of prices, Df:

If the first Close_x value (2121.25) is greater than the Close_x value 9 rows down (2116.25), I want a new column, "Profit", to hold 100, like this:

Df['Profit'] = ''

for index, row in Df.iterrows():
    if Df['Close_x'].shift(9) > Df['Close_x']:
        Df['Profit'] == 100
    else:
        Df['Profit'] == -100

I have also tried this:

for index, row in Df.iterrows():
    if Df['Close_x'] + 9 > Df['Close_x']:
        Df['Profit'] == 100
    else:
        Df['Profit'] == -100

Both attempts give the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Note that Close_x has thousands of rows, so I need to iterate with a rule such as "9 rows down from the current value" rather than calling a specific slice such as [:9].

【Comments on the question】:

Tags: python loops pandas dataframe iteration

【Solution 1】:

It seems you need numpy.where:

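import numpy as np

N = 3
# Compare each row with the row N positions above it (floats so NaN fits).
Df['Profit'] = np.where(Df['Close_x'].shift(N) > Df['Close_x'], 100.0, -100.0)
# The first N rows have no row N positions above (assumes the default 0..n-1 index).
Df.loc[Df.index < N, 'Profit'] = np.nan
print(Df)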
   Close_x  Profit
0  2121.25     NaN
1  2119.25     NaN
2  2119.50     NaN
3  2115.25   100.0
4  2120.00  -100.0
5  2118.00   100.0
6  2115.25  -100.0
7  2116.25   100.0
8  2116.25   100.0

Or maybe you need:

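N = 3
# Assumes the default integer index 0, 1, 2, ... so that index - N is a valid label.
for index, row in Df.iterrows():
    if index < N:
        continue  # no row N positions above this one
    if Df.loc[index - N, 'Close_x'] > Df.loc[index, 'Close_x']:
        Df.loc[index, 'Profit'] = 100
    else:
        Df.loc[index, 'Profit'] = -100
print(Df)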
   Close_x  Profit
0  2121.25     NaN
1  2119.25     NaN
2  2119.50     NaN
3  2115.25   100.0
4  2120.00  -100.0
5  2118.00   100.0
6  2115.25  -100.0
7  2116.25   100.0
8  2116.25   100.0
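
Both snippets above write the flag on the later row of each pair. If the flag should instead sit on the current row and look N rows down, as the question phrases it, a negative shift is one option. This is only a sketch: the sample prices are copied from the printed output above, N = 3 stands in for the asker's 9, and the +100/-100 rule is assumed from the question.

```python
import numpy as np
import pandas as pd

# Sample prices copied from the printed output above.
Df = pd.DataFrame({'Close_x': [2121.25, 2119.25, 2119.50, 2115.25, 2120.00,
                               2118.00, 2115.25, 2116.25, 2116.25]})

N = 3  # the question uses 9

# shift(-N) aligns each row with the value N rows below it.
below = Df['Close_x'].shift(-N)

# Assumed rule (from the question): +100 when the current close is higher
# than the close N rows down, otherwise -100.
Df['Profit'] = np.where(Df['Close_x'] > below, 100.0, -100.0)

# The last N rows have nothing N rows below them, so leave them as NaN.
Df.loc[below.isna(), 'Profit'] = np.nan

print(Df)
```

Masking with below.isna() rather than Df.index < N means the sketch does not depend on the index being 0, 1, 2, ...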

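【Comments】:

Hi, thanks for your help. I do want to keep the profit at +100 or -100; I don't care about the actual profit when the price changes, only a binary +100 or -100 depending on whether the price is higher or lower. I'm working on that loop now. And yes, I needed to use the loc function rather than what I was doing...
Unfortunately I have never worked in finance, so it is hard for me to answer. But I'll try to edit the answer.
Perfect! Thank you very much!
The first solution is also faster, because there is no loop ;)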
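
One rough way to check that claim is to run both approaches on the same data, confirm they agree, and time them. The sketch below is an assumption-laden illustration: the 1,000 random prices and the helper names vectorized and looped are made up for the test and are not from the thread. Timings depend on the machine, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

N = 3
rng = np.random.default_rng(0)
# Hypothetical test data: 1,000 random prices, not the asker's data.
prices = pd.DataFrame({'Close_x': rng.normal(2100.0, 5.0, size=1000).round(2)})


def vectorized(df, n):
    """numpy.where version: one comparison over whole columns."""
    out = df.copy()
    prev = out['Close_x'].shift(n)
    out['Profit'] = np.where(prev > out['Close_x'], 100.0, -100.0)
    out.loc[prev.isna(), 'Profit'] = np.nan
    return out


def looped(df, n):
    """iterrows version: one comparison per row (assumes a 0..n-1 index)."""
    out = df.copy()
    out['Profit'] = np.nan
    for index, row in out.iterrows():
        if index < n:
            continue
        prev = out.loc[index - n, 'Close_x']
        out.loc[index, 'Profit'] = 100.0 if prev > row['Close_x'] else -100.0
    return out


same = vectorized(prices, N).equals(looped(prices, N))
print('same result:', same)
print('vectorized seconds:', timeit.timeit(lambda: vectorized(prices, N), number=1))
print('looped seconds:', timeit.timeit(lambda: looped(prices, N), number=1))
```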

【Solution 2】:

for index, row in Df.iterrows():
    if Df['Close_x'].shift(9) > Df['Close_x']:
        Df['Profit'] == 100
    else:
        Df['Profit'] == -100

You are iterating over your DataFrame but never use the variables index and row even once? That doesn't seem right.
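
To make the two problems in that loop concrete, here is a small sketch on four of the prices from the thread (the shift of 1 is chosen only to keep the example short). It shows that the condition is a whole boolean Series, which `if` cannot reduce to a single True or False, and that `==` compares without assigning anything.

```python
import pandas as pd

Df = pd.DataFrame({'Close_x': [2121.25, 2119.25, 2119.50, 2115.25]})

# Comparing two Series gives a Series of booleans, one per row.
mask = Df['Close_x'].shift(1) > Df['Close_x']
print(mask.tolist())

# `if` needs a single True/False, so a multi-element Series raises.
raised = False
try:
    if mask:
        pass
except ValueError as exc:
    raised = True
    print('ValueError:', exc)

# `==` only compares; it returns a new boolean Series and stores nothing.
Df['Profit'] = ''
result = Df['Profit'] == 100
print(result.tolist())
print(Df['Profit'].tolist())  # the column is unchanged
```

Because the condition already covers every row at once, wrapping it in a row-by-row loop adds nothing; either use the whole-column comparison directly or index a single row inside the loop.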

【Comments】: