resample
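You can use pd.DataFrame.resample with "20D" to specify a 20-day time rule. Use the on parameter to name the column to resample on. The resulting resample object behaves much like a groupby object and supports the apply method.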
def dcorr(df, n):
    # Correlation matrix of the columns within each consecutive n-day bin.
    return df.resample(f"{n}D", on='date').apply(lambda d: d.corr())
dcorr(df, 20)
                     A         B
date
2000-01-01 A  1.000000  0.241121
           B  0.241121  1.000000
2000-01-21 A  1.000000  0.083664
           B  0.083664  1.000000
2000-02-10 A  1.000000  0.432988
           B  0.432988  1.000000
2000-03-01 A  1.000000 -0.269869
           B -0.269869  1.000000
2000-03-21 A  1.000000 -0.188370
           B -0.188370  1.000000
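To see how the "20D" rule partitions the rows before any correlation is computed, it can help to count the rows that land in each bin. This is a minimal sketch: the sample data, the default_rng(0) seed, and the name bin_sizes are illustrative assumptions, and it relies on the default anchoring, where bins start at midnight of the first timestamp's day.

```python
import numpy as np
import pandas as pd

# Illustrative data (assumption): 100 daily rows with two random columns.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame(rng.random((n, 2)), columns=['A', 'B'])
df['date'] = pd.date_range('2000-01-01', periods=n)

# Count the rows in each 20-day bin; the index holds each bin's start date.
bin_sizes = df.resample('20D', on='date').size()
print(bin_sizes)
```

With 100 daily rows this gives five bins of 20 rows each, starting on 2000-01-01, 2000-01-21, and so on, which matches the index of the output above.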
groupby
df.set_index('date').groupby(pd.Grouper(freq='20D')).corr()
                     A         B
date
2000-01-01 A  1.000000  0.241121
           B  0.241121  1.000000
2000-01-21 A  1.000000  0.083664
           B  0.083664  1.000000
2000-02-10 A  1.000000  0.432988
           B  0.432988  1.000000
2000-03-01 A  1.000000 -0.269869
           B -0.269869  1.000000
2000-03-21 A  1.000000 -0.188370
           B -0.188370  1.000000
Or, to keep only the A-B correlation for each bin:
df.set_index('date').groupby(pd.Grouper(freq='20D')).corr().unstack()[('A', 'B')]
date
2000-01-01    0.241121
2000-01-21    0.083664
2000-02-10    0.432988
2000-03-01   -0.269869
2000-03-21   -0.188370
Name: (A, B), dtype: float64
You can also state explicitly which columns to correlate:
df.resample("20D", on='date').apply(lambda d: d.A.corr(d.B))
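The same pairwise calculation can also be written with groupby and pd.Grouper. This is a sketch under stated assumptions, not the only way to do it: the sample data and the helper name pair_corr are hypothetical, introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data (assumption): 100 daily rows with two random columns.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame(rng.random((n, 2)), columns=['A', 'B'])
df['date'] = pd.date_range('2000-01-01', periods=n)

def pair_corr(frame, x, y, days):
    """Pearson correlation of columns x and y within consecutive `days`-day bins."""
    grouped = frame.groupby(pd.Grouper(key='date', freq=f'{days}D'))
    # Select only the two columns so each group passed to the lambda holds just x and y.
    return grouped[[x, y]].apply(lambda g: g[x].corr(g[y]))

result = pair_corr(df, 'A', 'B', 20)
print(result)
```

The result is a Series indexed by each bin's start date, with one correlation value per 20-day bin.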
Setup
import numpy as np
import pandas as pd

np.random.seed([3, 1415])
n = 100
df = pd.DataFrame(np.random.rand(n,2), columns=['A','B'])
df['date'] = pd.date_range('2000-01-01', periods=n, name='date')
Debugging
import pandas as pd
import numpy as np
np.random.seed([3, 1415])
n = 100
df = pd.DataFrame(
    np.random.rand(n, 4),
    pd.date_range('2000-01-01', periods=n, name='date'),
    ['ABC','XYZ __', 'One', 'Two Three']
)

def dcorr(df, n):
    return df.resample(f"{n}D").apply(lambda d: d.corr())
dcorr(df, 20)
Output
                           ABC    XYZ __       One Two Three
date
2000-01-01 ABC        1.000000 -0.029687  0.403720  0.078800
           XYZ __    -0.029687  1.000000 -0.231223 -0.333266
           One        0.403720 -0.231223  1.000000  0.330959
           Two Three  0.078800 -0.333266  0.330959  1.000000
2000-01-21 ABC        1.000000 -0.024610  0.206002 -0.059523
           XYZ __    -0.024610  1.000000 -0.601174 -0.101306
           One        0.206002 -0.601174  1.000000  0.149536
           Two Three -0.059523 -0.101306  0.149536  1.000000
2000-02-10 ABC        1.000000 -0.361072  0.156693 -0.040827
           XYZ __    -0.361072  1.000000 -0.077173 -0.232536
           One        0.156693 -0.077173  1.000000  0.343754
           Two Three -0.040827 -0.232536  0.343754  1.000000
2000-03-01 ABC        1.000000  0.204763 -0.013132  0.115202
           XYZ __     0.204763  1.000000 -0.339747 -0.206922
           One       -0.013132 -0.339747  1.000000  0.310002
           Two Three  0.115202 -0.206922  0.310002  1.000000
2000-03-21 ABC        1.000000  0.062841 -0.245393  0.233697
           XYZ __     0.062841  1.000000 -0.213742  0.341582
           One       -0.245393 -0.213742  1.000000  0.251169
           Two Three  0.233697  0.341582  0.251169  1.000000