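【Posted】: 2019-10-25 12:20:54
【Question】:
I have a dataset with the following columns:
['sex', 'age', 'relationship_status']
The 'relationship_status' column contains some NaN values, and I want to replace each of them with the most common value in its group, where the groups are defined by age and sex.
I know how to group the rows and count the values: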
df2.groupby(['age','sex'])['relationship_status'].value_counts()
which returns:
age   sex     relationship_status
17.0  female  Married with kids      1
18.0  female  In relationship        5
              Married                4
              Single                 4
              Married with kids      2
      male    In relationship        9
              Single                 5
              Married                4
              Married with kids      4
              Divorced               3
.
.
.
86.0  female  In relationship        1
92.0  male    Married                1
97.0  male    In relationship        1
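To restate, what I need is this: whenever 'relationship_status' is empty, the program should replace it with the most common value among people of the same age and sex.
Can anyone suggest how I should do this?
Kind regards.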
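One possible approach, as a minimal sketch rather than a definitive answer: compute the most frequent value per (age, sex) group with `groupby(...).transform`, then pass the result to `fillna`. The sample rows and the helper name `most_common` are hypothetical and only illustrate the idea; just the column names and `df2` come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df2 = pd.DataFrame({
    "sex": ["female", "female", "female", "female",
            "male", "male", "male", "male"],
    "age": [18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 18.0, 92.0],
    "relationship_status": ["Single", "Single", "Married", np.nan,
                            "In relationship", "In relationship", np.nan, np.nan],
})

def most_common(values: pd.Series):
    """Return the most frequent non-null value, or NaN if the group has none."""
    modes = values.mode()  # mode() ignores NaN by default
    return modes.iloc[0] if not modes.empty else np.nan

# One value per row: the most common status within that row's (age, sex) group.
group_mode = (
    df2.groupby(["age", "sex"])["relationship_status"].transform(most_common)
)

# Only the missing entries are replaced; existing values are left untouched.
df2["relationship_status"] = df2["relationship_status"].fillna(group_mode)
```

A group whose values are all missing (here the single 92-year-old male) has no most common value, so its NaN stays in place and would need a separate fallback. When several values tie for first place, `mode()` returns them sorted, so this sketch picks the first one in sort order.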
【Comments】: