Answers: MultiIndex string replacement on a column

【Question title】: MultiIndex str replacement on column
【Posted】: 2020-01-28 05:11:01
【Question description】:

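I want to replace all the values in one column of a MultiIndex dataframe. I found a dirty way to do it, but I'm looking for something cleaner.

In case it helps, the data was imported from an .xlsx file, since that let me strip the "," from the first column using the thousands option.

All the numbers are strings, so I need to convert them to float or int, hence the str.replace call.

Example dataframe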

Name    0                       1                      ...
Col     A           B           A            B         ...
0       409511  30.3%           355529   30.3%  ...
1       332276  20.3%           083684   20.3%  ...
2       138159  10.3%           570834   10.3%  ...

If I use

df['0','B']= df['0','B'].str.replace('%','').astype(float)

this works, but I don't want to repeat it for every column.

I have been experimenting with

df.loc[:,pd.IndexSlice[:,'B']].str.replace('%','').astype(float)

but I get the error

'DataFrame' object has no attribute 'str'
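
The error arises because .str is an accessor on a Series, while selecting several columns with pd.IndexSlice returns a DataFrame. Below is a minimal sketch of one workaround, assuming a small frame shaped like the example above (the names columns, pct and cleaned are made up for illustration). DataFrame.apply passes each column to the function as a Series, where .str is available.

```python
import pandas as pd

# Hypothetical frame mirroring the layout in the question (illustrative values).
columns = pd.MultiIndex.from_product([["0", "1"], ["A", "B"]], names=["Name", "Col"])
df = pd.DataFrame(
    [
        ["409511", "30.3%", "355529", "30.3%"],
        ["332276", "20.3%", "083684", "20.3%"],
        ["138159", "10.3%", "570834", "10.3%"],
    ],
    columns=columns,
)

idx = pd.IndexSlice
pct = df.loc[:, idx[:, "B"]]  # a DataFrame, so pct.str does not exist

# apply() hands each column to the lambda as a Series, which does have .str
cleaned = pct.apply(lambda s: s.str.replace("%", "", regex=False).astype(float))

print(cleaned.dtypes)
```

Note that cleaned is a new frame; df itself is unchanged until the result is written back.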

I also tried

df.loc[:,pd.IndexSlice[:,'Percent']].replace('%','')

which returns a dataframe without any error, but the values are unchanged.

If I do

df.loc[:,pd.IndexSlice[:,'Percent']].replace('%','').astype(float)

I get: could not convert string to float: '33.3%'
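
Both results are consistent with how DataFrame.replace treats a plain string: without regex=True it only replaces cells whose entire value equals the pattern, so a cell like '33.3%' is left alone and astype(float) then fails on it. A small sketch with made-up data (the frame demo is purely illustrative):

```python
import pandas as pd

# Illustrative data: one cell merely contains "%", the other is exactly "%".
demo = pd.DataFrame({"B": ["30.3%", "%"]})

# Default behaviour: the pattern must match the whole cell value.
whole = demo.replace("%", "")

# regex=True: the pattern is a regular expression and every match
# inside each string is replaced.
partial = demo.replace("%", "", regex=True)

print(whole["B"].tolist())    # ['30.3%', '']
print(partial["B"].tolist())  # ['30.3', '']
```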

I read https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html, but it has nothing about replacement.

I also couldn't find anything on this in https://jakevdp.github.io/PythonDataScienceHandbook/03.05-hierarchical-indexing.html

【Question comments】:

Tags: python pandas dataframe

【Solution 1】:

You can try pd.IndexSlice together with loc and update (note: you need regex=True)

idx = pd.IndexSlice
df.update(df.loc[:, idx[:,'B']].replace('%', '', regex=True).astype(float))

Out[1374]:
        0             1
        A     B       A     B
0  409511  30.3  355529  30.3
1  332276  20.3   83684  20.3
2  138159  10.3  570834  10.3
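
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach. The frame is rebuilt from the question's sample as strings, which is an assumption about the real data, and converted is just an illustrative name.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (all values as strings).
columns = pd.MultiIndex.from_product([["0", "1"], ["A", "B"]])
df = pd.DataFrame(
    [
        ["409511", "30.3%", "355529", "30.3%"],
        ["332276", "20.3%", "083684", "20.3%"],
        ["138159", "10.3%", "570834", "10.3%"],
    ],
    columns=columns,
)

idx = pd.IndexSlice

# Select every column whose second level is "B", strip "%", convert to float.
converted = df.loc[:, idx[:, "B"]].replace("%", "", regex=True).astype(float)

# update() writes the non-NA values of `converted` into df in place.
df.update(converted)

print(df)
# The cells now hold floats, but update() may leave the column dtype as
# object depending on the pandas version, so checking is worthwhile.
print(df.dtypes)
```

If a true float dtype matters downstream, it is worth inspecting df.dtypes after the update rather than assuming it changed.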

Or use filter and update the result back into df

df.update(df.filter(like='B').replace('%', '', regex=True).astype(float))

Out[1363]:
        0             1
        A     B       A     B
0  409511  30.3  355529  30.3
1  332276  20.3   83684  20.3
2  138159  10.3  570834  10.3
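
One caveat: filter(like=...) is a substring match, so in a wider frame it could also pick up other labels that happen to contain 'B'. As a hedged alternative (not part of the original answer), the sketch below matches the inner level exactly and assigns each column back directly. It assumes the column levels are named Name and Col as in the question's example; is_pct is a made-up name.

```python
import pandas as pd

# Hypothetical frame with named column levels, as in the question's example.
columns = pd.MultiIndex.from_product([["0", "1"], ["A", "B"]], names=["Name", "Col"])
df = pd.DataFrame(
    [
        ["409511", "30.3%", "355529", "30.3%"],
        ["332276", "20.3%", "083684", "20.3%"],
        ["138159", "10.3%", "570834", "10.3%"],
    ],
    columns=columns,
)

# Exact match on the inner level rather than a substring match.
is_pct = df.columns.get_level_values("Col") == "B"

for col in df.columns[is_pct]:
    # Assigning a whole column replaces it, so its dtype becomes float64.
    df[col] = df[col].str.rstrip("%").astype(float)

print(df.dtypes)
```

This is a plain loop rather than a one-liner, but it avoids update entirely and leaves the "A" columns untouched.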

【Comments】:

Thank you, this works perfectly. I'll read up on what regex=True and df.update actually do.
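
On the df.update part: it aligns the other frame on index and column labels, overwrites the matching cells in place with the other frame's non-NA values, and returns None. A small sketch with made-up frames (left and right are illustrative names, not from the question):

```python
import pandas as pd

left = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": ["a", "b", "c"]})

# `right` covers only rows 0 and 2 of column "x"; row 2 holds a missing value.
right = pd.DataFrame({"x": [10.0, None]}, index=[0, 2])

result = left.update(right)  # modifies `left` in place

print(result)               # None
print(left["x"].tolist())   # [10.0, 2.0, 3.0]  (the NaN did not overwrite 3.0)
print(left["y"].tolist())   # ['a', 'b', 'c']   (column absent from `right`)
```

Because update returns None, writing df = df.update(...) would discard the frame.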