Is there a line of code which I can run to get unique values from one column based on a groupby?

【Question title】: Is there a line of code which I can run to get unique values from one column based on a groupby?
【Posted】: 2019-08-19 01:17:39
【Question】:

I ran a groupby on my df to group the rows by country code and sort each group by name, i.e.:

DataGroup = df.groupby('COUNTRY_CODE').apply(lambda x: x.sort_values('Name'))
DataGroup

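I have managed to group all the columns by the country code they belong to. Now I want to get the unique names from the Name column within each country-code group. For example, if for 'UK' the column contains the names ['John', 'Mary', 'Fatema', 'John'], how can I show only the unique values in the df column?

I hope this makes sense.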

【Comments】:

Tags: python pandas group-by jupyter

【Solution 1】:

Yes, you can do this with drop_duplicates by specifying the subset of columns to check for duplicates: DataGroup.drop_duplicates(subset=["COUNTRY_CODE", "Name"])

Complete example:

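import pandas as pd

# Small sample frame: "John" appears twice under country code "NO"
df = pd.DataFrame([("NO", "John"),
                   ("NO", "Jane"),
                   ("NO", "John"),
                   ("SE", "Ville")], columns=["COUNTRY_CODE","Name"])

# Group by country code and sort each group by name (as in the question)
DataGroup = df.groupby('COUNTRY_CODE').apply(lambda x: x.sort_values('Name'))
print(DataGroup)

# Keep only the first row for each (COUNTRY_CODE, Name) pair
DataGroup = DataGroup.drop_duplicates(subset=["COUNTRY_CODE", "Name"])
print(DataGroup)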

Output:
               COUNTRY_CODE   Name
COUNTRY_CODE                      
NO           1           NO   Jane
             0           NO   John
             2           NO   John
SE           3           SE  Ville
               COUNTRY_CODE   Name
COUNTRY_CODE                      
NO           1           NO   Jane
             0           NO   John
SE           3           SE  Ville
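
If only the list of distinct names per country is needed, rather than a de-duplicated DataFrame, a possible alternative (not part of the answer above) is to select the Name column from the groupby and call unique() on it. The sketch below reuses the same sample data; the variable names unique_names and unique_sorted are chosen here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [("NO", "John"), ("NO", "Jane"), ("NO", "John"), ("SE", "Ville")],
    columns=["COUNTRY_CODE", "Name"],
)

# One entry per country code; each value is an array of the distinct names,
# in order of first appearance within that country.
unique_names = df.groupby("COUNTRY_CODE")["Name"].unique()

# Variant: the same distinct names as alphabetically sorted plain lists.
unique_sorted = df.groupby("COUNTRY_CODE")["Name"].agg(lambda s: sorted(s.unique()))

print(unique_names)
print(unique_sorted)
```

The result is a Series indexed by COUNTRY_CODE, so unique_names["NO"] gives the names for that one country. Use the drop_duplicates approach instead when the other columns of each row should be kept.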

【Discussion】:

Perfect! Thanks for your help :)