[Posted]: 2018-01-22 15:20:02
[Question]:
I have a DataFrame with 8 columns, as follows:
Bank_Acct Firstname | Bank_Acct Lastname | Bank_AcctNumber | Firstname | Lastname | ID | Date1 | Date2
B1 | Last1 | 123 | ABC | EFG | 12 | Somedate | Somedate
B2 | Last2 | 245 | ABC | EFG | 12 | Somedate | Somedate
B1 | Last1 | 123 | DEF | EFG | 12 | Somedate | Somedate
B3 | Last3 | 356 | ABC | GHI | 13 | Somedate | Somedate
B4 | Last4 | 478 | XYZ | FHJ | 13 | Somedate | Somedate
B5 | Last5 | 599 | XYZ | DFI | 13 | Somedate | Somedate
I want to build a dictionary of the form:
{ID1: (Count of distinct Bank_Acct Firstname, Count of distinct Bank_Acct Lastname,
{Bank_AcctNumber1 : ItsCount, Bank_AcctNumber2 : ItsCount},
Count of distinct Firstname, Count of distinct Lastname),
ID2: (...), }
For the example above:
{12: (2, 2, {123: 2, 245: 1}, 2, 1), 13 : (3, 3, {356: 1, 478: 1, 599: 1}, 2, 3)}
Here is my code:
cols = ['Bank_Acct Firstname', 'Bank_Acct Lastname', 'Bank_AcctNumber', 'Firstname', 'Lastname']
df1 = df.groupby('ID').apply(lambda x: tuple(x[c].nunique() for c in cols))
d = df1.to_dict()
But the code above only produces this output:
{12: (2, 2, 2, 2, 1), 13 : (3, 3, 3, 2, 3)}
It gives the count of distinct bank account numbers rather than the inner dictionary.
How can I get the desired dictionary? Thanks!!
[Comments]:
Tags: python pandas dictionary dataframe group-by
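One possible approach, offered as a sketch rather than a definitive answer: `nunique()` always collapses a column to a single number, so the account-number column needs different treatment, for example `value_counts().to_dict()`, which maps each value to how often it occurs. The snippet below assumes the column names shown in the table header (not the ones in the question's `cols` list), omits the two date columns, and uses a hypothetical helper named `summarize`.

```python
import pandas as pd

# Sample data mirroring the table in the question (date columns omitted).
df = pd.DataFrame({
    "Bank_Acct Firstname": ["B1", "B2", "B1", "B3", "B4", "B5"],
    "Bank_Acct Lastname": ["Last1", "Last2", "Last1", "Last3", "Last4", "Last5"],
    "Bank_AcctNumber": [123, 245, 123, 356, 478, 599],
    "Firstname": ["ABC", "ABC", "DEF", "ABC", "XYZ", "XYZ"],
    "Lastname": ["EFG", "EFG", "EFG", "GHI", "FHJ", "DFI"],
    "ID": [12, 12, 12, 13, 13, 13],
})

# Assumed column names, taken from the table header.
ACCT = "Bank_AcctNumber"
cols = ["Bank_Acct Firstname", "Bank_Acct Lastname", ACCT, "Firstname", "Lastname"]


def summarize(group):
    """Hypothetical helper: build one tuple per ID group.

    The account-number column becomes a {value: occurrences} dict;
    every other column becomes its number of distinct values.
    """
    return tuple(
        group[c].value_counts().to_dict() if c == ACCT else group[c].nunique()
        for c in cols
    )


# Iterating over the GroupBy yields (ID, sub-DataFrame) pairs.
d = {key: summarize(group) for key, group in df.groupby("ID")}
print(d)
```

On the sample rows this should reproduce the dictionary requested in the question. Building the result with a dict comprehension over the groups, instead of `groupby(...).apply(...)`, keeps each tuple intact and avoids pandas trying to reshape the returned values.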