【Question title】:Efficient way to compute Aroon indicator in pandas
【Posted】:2022-10-30 09:07:11
【Question description】:
I have to compute the Aroon indicator for data stored in a DataFrame:
import pandas as pd
import numpy as np
N = 100000
np.random.seed(42)
df = pd.DataFrame()
df['Time'] = np.arange(1, N + 1, 1)
df['High'] = 10 + np.sin(2*np.pi/(N/2)*df['Time']) + 0.5*np.random.randn(N)
df['Low'] = df['High'] - (0.1*np.random.randn(N) + 1)**2
   Time       High       Low
0     1  10.248483  9.031743
1     2   9.931119  9.148842
2     3  10.324221  9.205823
3     4  10.762018  9.882031
4     5   9.883552  8.947960
5     6   9.883686  8.874142
6     7  10.790486  9.814241
7     8  10.384723  9.691851
8     9   9.766394  8.470937
9    10  10.272537  9.032786
Following this answer, I can use:
n = 25
df['Aroon Up'] = 100*df['High'].rolling(n + 1).apply(lambda x: x.argmax())/n
df['Aroon Down'] = 100*df['Low'].rolling(n + 1).apply(lambda x: x.argmin())/n
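This works, but it is very slow on the DataFrames I actually have to process, which have more than 500,000 rows.
How can I speed up the computation of the Aroon indicator?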
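To make the window arithmetic concrete, here is a minimal sketch on a toy series (the values and the name `high` are made up for illustration). With n = 2, each window holds n + 1 = 3 values, and the position of the maximum inside the window, scaled by 100/n, gives Aroon Up. Passing `raw=True` hands NumPy arrays to the lambda instead of Series, which usually reduces per-window overhead, although the loop over windows still runs in Python.

```python
import numpy as np
import pandas as pd

n = 2
high = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0, 0.0])  # illustrative values only

# Position of the maximum within each window of n + 1 values, scaled to 0..100.
aroon_up = 100 * high.rolling(n + 1).apply(lambda x: x.argmax(), raw=True) / n

print(aroon_up.tolist())
# [nan, nan, 50.0, 100.0, 50.0, 0.0]
```

The first n entries are NaN because no complete window of n + 1 values exists there yet.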
【Comments】:
-
There is a deleted answer there that uses numba, linked here. Maybe it is faster.
Tags:
python
pandas
dataframe
numpy
【Solution 1】:
You can use sliding_window_view instead of rolling:
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

aroon_up = 100 * sliding_window_view(df['High'], n+1).argmax(1) / n
aroon_down = 100 * sliding_window_view(df['Low'], n+1).argmin(1) / n
# The sliding window yields n fewer rows than the input, so pad the start with NaN
df['Aroon Up'] = np.hstack([[np.nan]*n, aroon_up])
df['Aroon Down'] = np.hstack([[np.nan]*n, aroon_down])
For 500K records:
%timeit 100 * sliding_window_view(df['High'], n+1).argmax(1) / n
31.8 ms ± 482 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit 100*df['High'].rolling(n + 1).apply(lambda x: x.argmax())/n
30.7 s ± 412 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
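Before swapping one formulation for the other, it may be worth checking that they agree. The following is a small sketch on synthetic data (the series `high` and the names `slow` and `fast` are assumptions made for this check, not part of the answer):

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

n = 25
rng = np.random.default_rng(0)
high = pd.Series(10 + rng.standard_normal(1_000))  # synthetic test data

# Reference: the rolling/apply formulation from the question.
slow = 100 * high.rolling(n + 1).apply(lambda x: x.argmax(), raw=True) / n

# Vectorized: one row per window, argmax along each row, padded with n NaNs.
windows = sliding_window_view(high.to_numpy(), n + 1)
fast = np.concatenate([np.full(n, np.nan), 100 * windows.argmax(axis=1) / n])

print(np.allclose(slow.to_numpy(), fast, equal_nan=True))
# True
```

Both approaches take the first occurrence of the extreme value within a window, so they should also agree when a window contains ties.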
【Solution 2】:
Here is a numba version:
import numpy as np
from numba import jit

@jit(nopython=True)
def aroon(data, period):
    size = len(data)
    # Outputs start as NaN; the first period - 1 entries have no full window
    out_up = np.array([np.nan] * size)
    out_down = np.array([np.nan] * size)
    for i in range(period - 1, size):
        # Reverse the last `period` values so that index 0 is the most recent one
        window = np.flip(data[i + 1 - period:i + 1])
        out_up[i] = ((period - window.argmax()) / period) * 100
        out_down[i] = ((period - window.argmin()) / period) * 100
    return out_up, out_down
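Note that the function above takes a single series and uses windows of `period` values, whereas the question uses windows of n + 1 values and separate High and Low columns, so the two will not produce identical numbers. The following is a rough sketch of a variant that follows the question's convention. The function name `aroon_question_convention` and the fallback decorator are assumptions made for this example, not part of the answer.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: run as plain (slow) Python if numba is missing
    def njit(func):
        return func


@njit
def aroon_question_convention(high, low, n):
    """Aroon Up/Down over windows of n + 1 values, as 100 * position / n."""
    size = len(high)
    up = np.full(size, np.nan)
    down = np.full(size, np.nan)
    for i in range(n, size):
        start = i - n  # the window covers indices start .. i inclusive
        hi_pos = 0
        lo_pos = 0
        for j in range(1, n + 1):
            # Strict comparisons keep the first occurrence, like argmax/argmin.
            if high[start + j] > high[start + hi_pos]:
                hi_pos = j
            if low[start + j] < low[start + lo_pos]:
                lo_pos = j
        up[i] = 100.0 * hi_pos / n
        down[i] = 100.0 * lo_pos / n
    return up, down
```

With the DataFrame from the question it could be called as `aroon_question_convention(df['High'].to_numpy(), df['Low'].to_numpy(), n)`. Passing NumPy arrays rather than Series matters because numba's nopython mode does not accept pandas objects.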