【Posted】: 2018-07-28 22:49:10
【Question】:
Good morning, I'm new to pandas. I have a DataFrame named df with 4 columns: Age, Survived, Pclass and Sex (PassengerId is the index). Some of the Age values are NaN:
             Age  Survived  Pclass     Sex
PassengerId
6            NaN         0       3    male
18           NaN         1       2    male
20           NaN         1       3  female
27           NaN         0       3    male
29           NaN         1       3  female
I want to replace the NaN ages with values taken from a crosstab.
mean_val = pd.crosstab(index=df["Survived"], columns=[df['Sex'], df['Pclass']], values=df['Age'], aggfunc=np.mean)
which produces the following:
Sex          female                             male
Pclass            1          2          3          1          2          3
Survived
0         25.666667  36.000000  23.818182  44.581967  33.369048  27.255814
1         34.939024  28.080882  19.329787  36.248000  16.022000  22.274211
What I'd like to do is something like this:
df['Age'] = mean_val[[df['Sex']][df['Pclass']][df['Survived']]]
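where I use the crosstab to look up the mean age for each passenger's Sex, Pclass and Survived. The result would look like this: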
                   Age  Survived  Pclass     Sex
PassengerId
6            27.255814         0       3    male
18           16.022000         1       2    male
20           19.329787         1       3  female
27           27.255814         0       3    male
29           19.329787         1       3  female
Thanks in advance for your help!
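For reference, here is a minimal sketch of one way the lookup described above might be done. It is not the only approach and makes some assumptions: the sample rows with PassengerId 101 to 106 and their ages are made up for illustration, and the names `lookup`, `keys` and `group_mean` are hypothetical. The idea is to flatten the crosstab into a Series keyed by (Sex, Pclass, Survived), look up each passenger's key, and fill only the missing ages.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the five NaN rows from the question plus made-up
# rows (ids 101-106) with known ages so each needed group has a mean.
df = pd.DataFrame(
    {
        "Age": [np.nan, np.nan, np.nan, np.nan, np.nan,
                20.0, 30.0, 10.0, 18.0, 22.0, 26.0],
        "Survived": [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1],
        "Pclass": [3, 2, 3, 3, 3, 3, 3, 2, 2, 3, 3],
        "Sex": ["male", "male", "female", "male", "female",
                "male", "male", "male", "male", "female", "female"],
    },
    index=pd.Index([6, 18, 20, 27, 29, 101, 102, 103, 104, 105, 106],
                   name="PassengerId"),
)

# Same crosstab as in the question (NaN ages are ignored by the mean).
mean_val = pd.crosstab(
    index=df["Survived"],
    columns=[df["Sex"], df["Pclass"]],
    values=df["Age"],
    aggfunc="mean",
)

# Flatten the table into a Series whose MultiIndex is (Sex, Pclass, Survived).
lookup = mean_val.unstack()

# Build one key per passenger, in the same level order as the lookup index.
keys = pd.MultiIndex.from_frame(df[list(lookup.index.names)])

# Look up each passenger's group mean and realign it to the passenger index.
group_mean = pd.Series(lookup.reindex(keys).to_numpy(), index=df.index)

# Fill only the missing ages; known ages are left untouched.
df["Age"] = df["Age"].fillna(group_mean)
print(df)
```

If the crosstab itself is not needed, `df.groupby(["Survived", "Sex", "Pclass"])["Age"].transform("mean")` should give the same per-row group means and can be passed to `fillna` directly.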
【Comments】:
Tags: python-3.x pandas crosstab