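I would rearrange the data into an n (date) by m (ticker) array and use numpy to compute the rolling average.
Given df with 100 companies and 253 days of prices from Yahoo Finance: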
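import pandas as pd
import numpy as np
df_n = df.to_numpy()
# cumulative sum down each column (one column per ticker)
sma_20 = np.cumsum(df_n, dtype=float, axis=0)
# from row 20 on, subtract the cumulative sum 20 rows back to get the 20-day window sum
sma_20[20:] = sma_20[20:] - sma_20[:-20]
# full windows: divide by 20
sma_20[19:] = sma_20[19:] / 20
# first 19 rows: divide by the number of days available so far
sma_20[:19] = sma_20[:19] / np.arange(1, 20)[:, None]
# fraction of days each ticker closed above its 20-day SMA
print((df_n > sma_20).sum(axis=0) / len(df_n))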
>>>
[0.41897233 0.61660079 0.7312253 0.71936759 0.74703557 0.743083
0.52964427 0.53359684 0.52964427 0.45849802 0.64031621 0.63241107
0.59683794 0.66798419 0.77470356 0.56521739 0.64426877 0.60869565
0.46640316 0.45059289 0.61660079 0.743083 0.69565217 0.56916996
0.63241107 0.69565217 0.55731225 0.6284585 0.60869565 0.66798419
0.59683794 0.56126482 0.62055336 0.65612648 0.54150198 0.46245059
0.62055336 0.54545455 0.54545455 0.68379447 0.59683794 0.50988142
0.81422925 0.65217391 0.60869565 0.66798419 0.56126482 0.57312253
0.74703557 0.64822134 0.44664032 0.67588933 0.6284585 0.61264822
0.60474308 0.50197628 0.58498024 0.54545455 0.65612648 0.61660079
0.66007905 0.64822134 0.60869565 0.58893281 0.68774704 0.66403162
0.50988142 0.62055336 0.4743083 0.53754941 0.60869565 0.62055336
0.60869565 0.743083 0.43873518 0.6916996 0.71936759 0.61264822
0.59288538 0.49011858 0.58102767 0.5256917 0.59288538 0.45454545
0.49407115 0.55335968 0.49011858 0.64031621 0.6798419 0.54150198
0.59683794 0.67588933 0.56126482 0.60474308 0.45454545 0.61264822
0.56521739 0.48221344 0.40711462 0.68379447]
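As a sanity check on the cumulative-sum trick, here is a minimal sketch that compares it against pandas' own rolling mean. The data is synthetic (a random walk standing in for the Yahoo Finance prices), and the ticker names and the helper name `sma_cumsum` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the price table: 253 dates x 5 hypothetical tickers.
rng = np.random.default_rng(0)
prices = pd.DataFrame(
    100 + rng.standard_normal((253, 5)).cumsum(axis=0),
    columns=["AAA", "BBB", "CCC", "DDD", "EEE"],
)

def sma_cumsum(values, window=20):
    """Rolling mean down axis 0 using the cumulative-sum trick (hypothetical helper)."""
    csum = np.cumsum(values, dtype=float, axis=0)
    out = csum.copy()
    # Window sum for row i >= window is csum[i] - csum[i - window].
    out[window:] = csum[window:] - csum[:-window]
    # Rows with a full window are divided by the window length.
    out[window - 1:] /= window
    # Warm-up rows are divided by the number of rows seen so far.
    out[:window - 1] /= np.arange(1, window)[:, None]
    return out

sma_np = sma_cumsum(prices.to_numpy())
# min_periods=1 makes pandas average over whatever is available in the warm-up rows,
# which is the same convention as the division by np.arange above.
sma_pd = prices.rolling(20, min_periods=1).mean().to_numpy()
matches = np.allclose(sma_np, sma_pd)
print(matches)
```

The `min_periods=1` argument matters here: with the pandas default, the first 19 rows would be NaN instead of a partial average, and the two results would not line up.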
Assign the probabilities and the corresponding companies to a new dataframe:
# one row per ticker: fraction of days above the 20-day SMA
df_result = pd.DataFrame((df_n > sma_20).sum(axis=0) / len(df_n), columns=['probability'])
df_result['company'] = df.columns
# highest fraction first
df_result = df_result.sort_values(by='probability', ascending=False).reset_index(drop=True)
df_result
###
probability company
0 0.814229 FTNT
1 0.774704 ASML
2 0.747036 INTU
3 0.747036 GOOGL
4 0.743083 AVGO
.. ... ...
95 0.450593 BIIB
96 0.446640 JD
97 0.438735 PCAR
98 0.418972 ATVI
99 0.407115 ZM
[100 rows x 2 columns]
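The rearranging step itself is not shown above, since df is assumed to already have one row per date and one column per ticker. If the data starts out in long format instead, a rough sketch of the reshape could look like this. The column names `date`, `ticker` and `close` and the sample values are assumptions for illustration, not taken from the original data.

```python
import pandas as pd

# Hypothetical long-format input: one row per (date, ticker) pair.
long_df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-01-04", "2021-01-04", "2021-01-05", "2021-01-05"]
    ),
    "ticker": ["AAA", "BBB", "AAA", "BBB"],
    "close": [10.0, 20.0, 11.0, 19.0],
})

# Rows become dates, columns become tickers; sort so the rolling window runs in date order.
wide = long_df.pivot(index="date", columns="ticker", values="close").sort_index()
wide_n = wide.to_numpy()  # n (date) by m (ticker) array
print(wide)
```

Keeping the wide frame around is useful because its columns supply the company labels for the result table.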