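【Posted】:2014-05-17 21:21:55
【Question】:
I have read the answers to "How to deal with this Pandas warning?", but I can't tell whether I should ignore the SettingWithCopyWarning or whether I am doing something wrong.
I have this function that resamples some data to a given time frame (e.g. 1 hour) and then fills the NaN values accordingly.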
def resample_data(raw_data, time_frame):
    # resamples the ticker data in ohlc
    ohlc_dict = {
        'open': 'first',
        'high': 'max',
        'low': 'min',
        'close': 'last',
        'price': 'mean'
    }
    volume_dict = {'volume': 'sum', 'volume_quote': 'sum'}
    resampled_data = raw_data.resample(time_frame, how={'price': ohlc_dict, 'amount': volume_dict})
    resampled_data['amount'] = resampled_data['amount']['volume'].fillna(0.0)
    resampled_data['amount']['volume_quote'] = resampled_data['amount']['volume']
    resampled_data['price']['close'] = resampled_data['price']['close'].fillna(method='pad')
    resampled_data['price']['open'] = resampled_data['price']['open'].fillna(resampled_data['price']['close'])
    resampled_data['price']['high'] = resampled_data['price']['high'].fillna(resampled_data['price']['close'])
    resampled_data['price']['low'] = resampled_data['price']['low'].fillna(resampled_data['price']['close'])
    resampled_data['price']['price'] = resampled_data['price']['price'].fillna(resampled_data['price']['close'])
    # ugly hack to remove multi index, must be better way
    output_data = resampled_data['price']
    output_data['volume'] = resampled_data['amount']['volume']
    output_data['volume_quote'] = resampled_data['amount']['volume_quote']
    return output_data
Is this the right way to do it? Should I just ignore the warning?
EDIT: If I try to use .loc as the warning suggests:
resampled_data = raw_data.resample(time_frame, how={'price': ohlc_dict, 'amount': volume_dict})
resampled_data.loc['amount'] = resampled_data['amount']['volume'].fillna(0.0)
resampled_data.loc['amount']['volume_quote'] = resampled_data['amount']['volume']
resampled_data.loc['price']['close'] = resampled_data['price']['close'].fillna(method='pad')
resampled_data.loc['price']['open'] = resampled_data['price']['open'].fillna(resampled_data['price']['close'])
resampled_data.loc['price']['high'] = resampled_data['price']['high'].fillna(resampled_data['price']['close'])
resampled_data.loc['price']['low'] = resampled_data['price']['low'].fillna(resampled_data['price']['close'])
resampled_data.loc['price']['price'] = resampled_data['price']['price'].fillna(resampled_data['price']['close'])
I get the following error, referring to the line resampled_data.loc['price']['close'] = resampled_data['price']['close'].fillna(method='pad'):
KeyError: 'the label [price] is not in the [index]'
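For readers hitting the same KeyError, here is a minimal sketch of what seems to be going on: `.loc` with a single label looks that label up in the row index, and 'price' is a column label, not a row label. The small frame below is a hypothetical stand-in for the resampled data (the values and the plain integer row index are assumptions made for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the resampled frame: two-level columns, integer rows.
columns = pd.MultiIndex.from_tuples(
    [("amount", "volume"), ("price", "close"), ("price", "open")]
)
frame = pd.DataFrame(
    [[5.0, 1.0, 1.1], [np.nan, np.nan, np.nan], [7.0, 3.0, 2.9]],
    columns=columns,
)

# .loc with a single label searches the row index, where 'price' does not exist.
try:
    frame.loc["price"]
    row_lookup_failed = False
except KeyError:
    row_lookup_failed = True

# Column labels belong in the second position of .loc;
# a column of a two-level column index is addressed by a tuple.
frame.loc[:, ("price", "close")] = frame.loc[:, ("price", "close")].ffill()
close_after_fill = frame[("price", "close")].tolist()
```

Because the assignment happens in a single `.loc` call on the frame itself, no intermediate object is involved, which is the pattern the warning text is pointing toward.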
【Comments】:
-
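You are using a MultiIndex, so you need to index with tuples (the documentation is quite clear on this). Use
resampled_data[('price','close')] = value. Posting an example that can actually be copied and pasted would be more useful.
-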
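The tuple-based assignment suggested in the comment above could be applied to the whole fill step roughly as follows. This is only a sketch under assumptions: `fill_gaps` is a hypothetical helper name, the sample frame is made up, and the final `droplevel` call is one possible replacement for the "ugly hack" in the question, not necessarily what the commenter had in mind.

```python
import numpy as np
import pandas as pd


def fill_gaps(resampled):
    """Fill empty bars using tuple keys on the two-level columns (no chained indexing)."""
    out = resampled.copy()
    out[("amount", "volume")] = out[("amount", "volume")].fillna(0.0)
    # Carry the last known close forward, then use it for the other price fields.
    out[("price", "close")] = out[("price", "close")].ffill()
    for field in ("open", "high", "low"):
        out[("price", field)] = out[("price", field)].fillna(out[("price", "close")])
    return out


# Hypothetical resampled data with one empty hour in the middle.
index = pd.to_datetime(["2014-05-17 21:00", "2014-05-17 22:00", "2014-05-17 23:00"])
columns = pd.MultiIndex.from_tuples(
    [("amount", "volume"), ("price", "close"), ("price", "high"),
     ("price", "low"), ("price", "open")]
)
resampled = pd.DataFrame(
    [[5.0, 10.5, 11.0, 9.0, 10.0],
     [np.nan, np.nan, np.nan, np.nan, np.nan],
     [2.0, 12.0, 12.5, 11.5, 11.0]],
    index=index,
    columns=columns,
)

filled = fill_gaps(resampled)
# Drop the outer column level to get flat column names.
flat = filled.droplevel(0, axis=1)
```

Each assignment targets one fully specified column on the frame, so pandas never has to guess whether an intermediate selection was a view or a copy.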
But do you have any suggestions for going from trade data to OHLCV? I am wondering whether it should be added in Cython.
-
@Jeff Got it, it works now. I was stressing over the copy warning, but it turned out to have nothing to do with it.
-
@MichaelWS But in what sense? Of course, it would be great to have a ready-made function that does this.
-
In pandas there is how='ohlc'; I guess the rest should be up to the user.
Tags: python pandas resampling
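As a rough illustration of the built-in OHLC aggregation mentioned above: in more recent pandas versions the `how=` argument is, to my knowledge, no longer accepted by `resample`, and the same aggregation is written as `.resample(...).ohlc()`. The tick data below is made up, and the column names 'price' and 'amount' are taken from the question.

```python
import pandas as pd

# Hypothetical trade ticks; one hour (22:00) has no trades at all.
ticks = pd.DataFrame(
    {"price": [10.0, 11.0, 9.0, 12.0], "amount": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(
        ["2014-05-17 21:05", "2014-05-17 21:20", "2014-05-17 21:50", "2014-05-17 23:10"]
    ),
)

# Built-in OHLC aggregation yields flat columns: open, high, low, close.
bars = ticks["price"].resample("60min").ohlc()
# Summing an empty bin gives 0.0, so volume needs no extra fill.
bars["volume"] = ticks["amount"].resample("60min").sum()

# Empty bars: carry the last close forward and reuse it for the other fields.
bars["close"] = bars["close"].ffill()
for field in ("open", "high", "low"):
    bars[field] = bars[field].fillna(bars["close"])
```

Since the result already has single-level columns, this route avoids both the MultiIndex flattening step and the chained assignments from the question.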