How do I create a rank based on multiple variables in Python?

【Question title】: Python how to create rank based on multiple variables?
【Posted】: 2021-06-30 19:16:43
【Question】:

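What I want to do is rank the children by age under their parent's ID, with the oldest child ranked 1 and everyone who is not a child ranked 0: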

ID	Relationship	Age	Rank
101	Parent	52	0
101	Spouse	50	0
101	Child	15	1
101	Child	12	2
201	Parent	40	0
201	Child	10	1

I got as far as df = df.sort_values(['ID','Relationship','Age']) but don't know where to go from there.

【Comments】:

Is this pandas? Then you should tag it as such...
Within each group, assign the rank only if the row is a child.
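
The hint in the comment above could be followed quite literally. The sketch below is one possible reading of it, not the commenter's own code: it rebuilds the question's table as a DataFrame (the construction is an assumption, since the question only shows the printed table), sorts the child rows by Age within each ID, and numbers them with cumcount. Unlike a dense rank, cumcount gives two children of the same age different ranks.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    "ID": [101, 101, 101, 101, 201, 201],
    "Relationship": ["Parent", "Spouse", "Child", "Child", "Parent", "Child"],
    "Age": [52, 50, 15, 12, 40, 10],
})

# Everyone starts at rank 0; only child rows are overwritten below.
df["Rank"] = 0
is_child = df["Relationship"].eq("Child")

# Order the children oldest-first within each ID, then number them 1, 2, ...
ordered = df[is_child].sort_values(["ID", "Age"], ascending=[True, False])
df.loc[ordered.index, "Rank"] = ordered.groupby("ID").cumcount() + 1

print(df)
```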

Tags: python python-3.x pandas

【Solution 1】:

Try this:

df["Rank"] = df[df["Relationship"]=="Child"].groupby("ID")["Age"].rank(method="dense", ascending=False).reindex(df.index).fillna(0)
>>> df
    ID Relationship  Age  Rank
0  101       Parent   52   0.0
1  101       Spouse   50   0.0
2  101        Child   15   1.0
3  101        Child   12   2.0
4  201       Parent   40   0.0
5  201        Child   10   1.0

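This first selects the rows whose Relationship is Child, groups those children by "ID", ranks them by Age within each group (oldest first), reindexes the result to the original df's index, and fills the resulting np.nan values with 0. Because of the intermediate NaN values, the Rank column comes out as float; append .astype(int) if integers are wanted.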
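
The same one-liner can be split into its individual steps, which makes it easier to see what each call contributes. This is a minimal sketch on sample data rebuilt from the question's table; the intermediate names children and child_rank are introduced here for illustration, and the final astype(int) is an optional addition that is not part of the answer.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    "ID": [101, 101, 101, 101, 201, 201],
    "Relationship": ["Parent", "Spouse", "Child", "Child", "Parent", "Child"],
    "Age": [52, 50, 15, 12, 40, 10],
})

# Step 1: keep only the child rows (their original index labels are preserved).
children = df[df["Relationship"] == "Child"]

# Step 2: rank the children by Age within each ID, oldest first.
child_rank = children.groupby("ID")["Age"].rank(method="dense", ascending=False)

# Step 3: align with the full frame; non-child rows become NaN, then 0.
# The cast to int is optional and not in the original one-liner.
df["Rank"] = child_rank.reindex(df.index).fillna(0).astype(int)

print(df)
```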

【Discussion】:

Thanks a lot!
Using where and na_option='top' would save the reindex and fillna, as in: df.where(df.Relationship == 'Child').groupby(df.ID).Age.rank('dense', ascending=False, na_option='top') - 1
Ah, but for the example the OP provided, my code (1.17 ms ± 12.6 µs) is much faster than yours (2.43 ms ± 27.4 µs)
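
For reference, the where / na_option='top' idea from the comments might look like the sketch below. It is a slight variation on the comment's expression: it masks only the Age column rather than the whole DataFrame, and it uses sample data rebuilt from the question's table. The approach relies on every ID having at least one non-child row: those rows become NaN, take rank 1 under na_option="top", and drop to 0 after subtracting 1. For an ID that contained only children, the oldest child would end up with rank 0.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    "ID": [101, 101, 101, 101, 201, 201],
    "Relationship": ["Parent", "Spouse", "Child", "Child", "Parent", "Child"],
    "Age": [52, 50, 15, 12, 40, 10],
})

# Blank out the age of every non-child row so that it ranks as NaN.
child_age = df["Age"].where(df["Relationship"] == "Child")

# NaN rows share rank 1 ("top"); children follow as 2, 3, ...; shift down by 1.
rank = child_age.groupby(df["ID"]).rank(
    method="dense", ascending=False, na_option="top"
) - 1
df["Rank"] = rank.astype(int)

print(df)
```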