【Posted】:2014-05-20 22:05:14
【Question】:
I have a pandas DataFrame with MultiIndex columns, and I want to apply a function to one of its columns and assign the result back to that same column.
In [1]:
import numpy as np
import pandas as pd
cols = ['One', 'Two', 'Three', 'Four', 'Five']
df = pd.DataFrame(np.array(list('ABCDEFGHIJKLMNO'), dtype='object').reshape(3,5), index = list('ABC'), columns=cols)
df.to_hdf('/tmp/test.h5', 'df')
df = pd.read_hdf('/tmp/test.h5', 'df')
df
Out[1]:
  One Two Three Four Five
A   A   B     C    D    E
B   F   G     H    I    J
C   K   L     M    N    O
3 rows × 5 columns
In [2]:
df.columns = pd.MultiIndex.from_arrays([list('UUULL'), ['One', 'Two', 'Three', 'Four', 'Five']])
df['L']['Five'] = df['L']['Five'].apply(lambda x: x.lower())
df
-c:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
Out[2]:
    U              L
  One Two Three Four Five
A   A   B     C    D    E
B   F   G     H    I    J
C   K   L     M    N    O
3 rows × 5 columns
In [3]:
df.columns = ['One', 'Two', 'Three', 'Four', 'Five']
df
Out[3]:
  One Two Three Four Five
A   A   B     C    D    E
B   F   G     H    I    J
C   K   L     M    N    O
3 rows × 5 columns
In [4]:
df['Five'] = df['Five'].apply(lambda x: x.upper())
df
Out[4]:
  One Two Three Four Five
A   A   B     C    D    E
B   F   G     H    I    J
C   K   L     M    N    O
3 rows × 5 columns
As you can see, the function was not applied to the column. I suspect this is related to the warning I get:
-c:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
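The strange thing is that this problem only occurs some of the time, and I have not been able to work out when it happens and when it does not.
I did manage to get it working by slicing the DataFrame with .loc, as the warning suggests: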
In [5]:
df.columns = pd.MultiIndex.from_arrays([list('UUULL'), ['One', 'Two', 'Three', 'Four', 'Five']])
df.loc[:,('L','Five')] = df.loc[:,('L','Five')].apply(lambda x: x.lower())
df
Out[5]:
    U              L
  One Two Three Four Five
A   A   B     C    D    e
B   F   G     H    I    j
C   K   L     M    N    o
3 rows × 5 columns
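But I would like to understand why this behavior occurs with dict-like slicing (e.g. df['L']['Five']) and not with .loc slicing.
Note: the DataFrame comes from an HDF file that was not multi-indexed. Could that be the cause of the strange behavior?
Edit: I am using Pandas v.0.13.1 and NumPy v.1.8.0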
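For reference, here is a minimal sketch of the two access patterns side by side. It assumes a reasonably recent pandas, builds the frame in memory instead of going through the HDF file, and uses `.str.lower()` in place of `apply(lambda x: x.lower())` (which should be equivalent for these string values). The comments describe the general view-versus-copy caveat of chained indexing. They are not a confirmed diagnosis of the 0.13.1 behavior shown above.

```python
import numpy as np
import pandas as pd

# Same 3x5 frame of single letters as in the question, built in memory
# (the HDF round trip is deliberately left out of this sketch).
values = np.array(list('ABCDEFGHIJKLMNO'), dtype='object').reshape(3, 5)
df = pd.DataFrame(values, index=list('ABC'))
df.columns = pd.MultiIndex.from_arrays(
    [list('UUULL'), ['One', 'Two', 'Three', 'Four', 'Five']]
)

# Chained form, df['L']['Five'] = ..., is two separate calls:
#   1. df['L'] is a __getitem__ that returns an intermediate DataFrame
#   2. ['Five'] = ... is a __setitem__ on that intermediate object
# Whether the write reaches df depends on whether the intermediate is a
# view or a copy, so the chained form is not used here.

# Single-call forms: the full column key goes to df itself in one step.
df.loc[:, ('L', 'Five')] = df.loc[:, ('L', 'Five')].str.lower()
df[('L', 'Four')] = df[('L', 'Four')].str.lower()

print(df)
```

Both single-call forms address the column by its full tuple key, `('L', 'Five')` or `('L', 'Four')`, so the assignment is made on `df` directly and not on an intermediate object.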
【Comments】:
Tags: python pandas apply multi-index