更改级别值的一种方法是构建一个新的 MultiIndex 并将其重新分配给df.index:
import pandas as pd
df = pd.DataFrame(
{'index':[ 26, 30, 46, 48, 72, '1C', '5M', '7D', '7Y',
'8F', '8J', 'AN', 'AS', 'C3', 'CA'],
'foo':1, 'bar':2})
df = df.set_index(['index', 'foo'])
level_values = [df.index.get_level_values(i) for i in range(index.nlevels)]
level_values[0] = level_values[0].astype(str)
df.index = pd.MultiIndex.from_arrays(level_values)
这使得级别值成为字符串:
In [53]: df.index.levels[0]
Out[56]:
Index(['1C', '26', '30', '46', '48', '5M', '72', '7D', '7Y', '8F', '8J', 'AN',
'AS', 'C3', 'CA'],
dtype='object', name='index')
或者,您可以通过使用reset_index 和set_value 来避免有点低级的混乱:
import pandas as pd
df = pd.DataFrame(
{'index':[ 26, 30, 46, 48, 72, '1C', '5M', '7D', '7Y',
'8F', '8J', 'AN', 'AS', 'C3', 'CA'],
'foo':1, 'bar':2})
df = df.set_index(['index', 'foo'])
df = df.reset_index('index')
df['index'] = df['index'].astype(str)
df = df.set_index('index', append=True)
df = df.swaplevel(0, 1, axis=0)
再次产生字符串值的索引级别值:
In [67]: df.index.levels[0]
Out[67]:
Index(['1C', '26', '30', '46', '48', '5M', '72', '7D', '7Y', '8F', '8J', 'AN',
'AS', 'C3', 'CA'],
dtype='object', name='index')
在这两个选项中,using_MultiIndex 更快:
N = 1000
def make_df(N):
df = pd.DataFrame(
{'index': np.random.choice(np.array(
[26, 30, 46, 48, 72, '1C', '5M', '7D', '7Y',
'8F', '8J', 'AN', 'AS', 'C3', 'CA'], dtype='O'), size=N),
'foo':1, 'bar':2})
df = df.set_index(['index', 'foo'])
return df
def using_MultiIndex(df):
level_values = [df.index.get_level_values(i) for i in range(index.nlevels)]
level_values[0] = level_values[0].astype(str)
df.index = pd.MultiIndex.from_arrays(level_values)
return df
def using_reset_index(df):
df = df.reset_index('index')
df['index'] = df['index'].astype(str)
df = df.set_index('index', append=True)
df = df.swaplevel(0, 1, axis=0)
return df
In [81]: %%timeit df = make_df(1000)
....: using_MultiIndex(df)
....:
1000 loops, best of 3: 693 µs per loop
In [82]: %%timeit df = make_df(1000)
....: using_reset_index(df)
....:
100 loops, best of 3: 2.09 ms per loop