【Question title】:Python dataframe - counting number of positive return days
【Posted】:2019-11-21 00:44:53
【Question】:

I have a dataset that looks like this:

print(portfolio_all[1])
            Date       Open       High  ...      Close  Adj Close    Volume
0     2010-01-04   4.840000   4.940000  ...   4.770000   4.513494   9837300
1     2010-01-05   4.790000   5.370000  ...   5.310000   5.024457  25212000
2     2010-01-06   5.190000   5.380000  ...   5.090000   4.816288  16597900
3     2010-01-07   5.060000   5.430000  ...   5.240000   4.958220  14033400
4     2010-01-08   5.270000   5.430000  ...   5.140000   4.863598  12760000
5     2010-01-11   5.130000   5.230000  ...   5.040000   4.768975  10952900
6     2010-01-12   5.060000   5.150000  ...   5.080000   4.806825   7870300
7     2010-01-13   5.120000   5.500000  ...   5.480000   5.185314  16400500
8     2010-01-14   5.460000   5.710000  ...   5.590000   5.289400  12767100
9     2010-01-15   5.640000   5.840000  ...   5.500000   5.204239  10985300
10    2010-01-19   5.500000   5.730000  ...   5.640000   5.336711   7807700
11    2010-01-20   5.650000   5.890000  ...   5.740000   5.431333  13289100

I want to count how many days had a positive return (i.e. Close_day_t > Close_day_t-1).

I tried the following function:

def positive_return_days(portfolio):
    positive_returns = pd.DataFrame(
    columns=['ticker', 'name', 'total positive', 'total days'])
    for asset in portfolio:
        for index, row in asset.iterrows():
            try:
                this_day_close = asset.iloc[[index]]['Close']
                previous_day_close = asset.iloc[[index-1]]['Close']
                asset.loc[index, 'positive_days'] = np.where((this_day_close > previous_day_close))
            except IndexError:
             print("I get out of bounds")
    total_positive_days = asset['positive_days'].sum()
    new_row = {'ticker':asset.name, 'name':asset.name, 'total positive':total_positive_days, 'total days':len(asset.index)}
    positive_returns = positive_returns.append(new_row, ignore_index=True)
    print("Asset: ", "total positive days: ", total_positive_days, "total days:",len(asset.index))
    return positive_returns

But I get an error:

ValueError: Can only compare identically-labeled Series objects

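How can I fix this?

【Comments】:

  • Can you post the expected output?
  • The expected output would be the number of positive-return days for each stock (and the percentage of positive-return days)...
  • Using iterrows is really slow. IIUC, you are looking for the following: print(((df["Close"] - df["Close"].shift(1))>0).sum())

Tags: python pandas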


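【Solution 1】:
  • You can use the .shift function to shift the column down by one row, so that each row is compared with the previous day's close.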
import pandas as pd

df = pd.DataFrame({'Close':[1,2,3,2,1,3]})

print(df)
print("count",(df.Close - df.Close.shift(1) > 0).sum())

Output:

   Close
0      1
1      2
2      3
3      2
4      1
5      3
count 3
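To get the per-stock summary the question asks for, the same shift comparison can be wrapped in a small function. This is only a sketch: it assumes `portfolio` is a dict mapping a ticker to a DataFrame with a `Close` column, and the `pct positive` column and the choice to divide by the number of days that have a previous close are assumptions for illustration, not part of the original post.

```python
import pandas as pd


def positive_return_days(portfolio):
    """Summarise positive-return days per asset.

    `portfolio` is assumed to be a dict of ticker -> DataFrame with a
    'Close' column (a hypothetical layout chosen for this sketch).
    """
    rows = []
    for ticker, asset in portfolio.items():
        close = asset["Close"]
        # Compare each close with the previous row; the first row compares
        # against NaN, which evaluates to False and is not counted.
        positive = int((close > close.shift(1)).sum())
        # Only rows that have a previous day can have a return at all.
        comparable = len(close) - 1
        rows.append({
            "ticker": ticker,
            "total positive": positive,
            "total days": len(close),
            "pct positive": positive / comparable if comparable > 0 else float("nan"),
        })
    return pd.DataFrame(
        rows, columns=["ticker", "total positive", "total days", "pct positive"]
    )


# Toy data reusing the Close values from the example above
summary = positive_return_days({"AAA": pd.DataFrame({"Close": [1, 2, 3, 2, 1, 3]})})
print(summary)
```

Building the rows in a list and creating the DataFrame once avoids calling `DataFrame.append` inside a loop, which was removed in pandas 2.0.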


    【Solution 2】:

    You can use pd.Series.diff to compute the day-over-day differences, then count the positive ones:

    (df['Close'].diff() > 0).sum()
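For context on the original error: selecting with double brackets, as in `asset.iloc[[index]]['Close']`, returns a one-element Series that keeps its row label, so the two sides of the comparison carry different labels and pandas refuses to compare them. The sketch below reproduces that on a few Close values taken from the question and contrasts it with scalar access and with `diff()`; the variable names are made up for illustration.

```python
import pandas as pd

close = pd.Series([4.77, 5.31, 5.09, 5.24], name="Close")

# Double brackets give one-element Series that keep their original labels
this_day = close.iloc[[1]]      # index label 1
previous_day = close.iloc[[0]]  # index label 0

try:
    this_day > previous_day
    comparison_failed = False
except ValueError as exc:
    # The labels differ, so the comparison is rejected
    comparison_failed = True
    print(exc)

# Single brackets return plain scalars, which compare without any labels
scalar_positive = close.iloc[1] > close.iloc[0]

# diff() handles the whole column at once; the first entry is NaN,
# and NaN > 0 is False, so it is not counted
positive_days = int((close.diff() > 0).sum())
print(scalar_positive, positive_days)
```

Either vectorised form, `shift` or `diff`, avoids both the label mismatch and the row-by-row loop.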
    
