【Posted】: 2020-10-01 11:27:27
【Question】:
I created a pivot table that shows the number of transactions per customer per month.
It is messy when printed, so here is a screenshot:
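You can see customer_id 1 (red), who made their first purchase in August 2019 (green). That makes the customer 12 months old. I got 12 by counting back from the most recent month.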
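How can I extract this information programmatically for all customers?
I want to add an extra column named tenure that holds each customer's age in months.
I think I should count columns backward or forward until I reach the first non-NaN value.
But I don't know how to do that. I tried some suggestions from similar questions here: transposing the DataFrame and calling Series.first_valid_index() for each customer, or combining df.isnull() with idxmax(), but I could not get either to work.
Any help is appreciated. I'm sure this is easy to solve, but I'm lost.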
My sample df:
df.to_dict()
{'customer_id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10},
'Jan 19': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Feb 19': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: 1.0,
9: nan},
'Mar 19': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Apr 19': {0: nan,
1: 1.0,
2: 1.0,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'May 19': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: 1.0,
7: nan,
8: nan,
9: nan},
'Jun 19': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Jul 19': {0: nan,
1: nan,
2: nan,
3: 1.0,
4: nan,
5: nan,
6: 2.0,
7: nan,
8: nan,
9: 2.0},
'Aug 19': {0: 1.0,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: 1.0,
9: nan},
'Sep 19': {0: 2.0,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Oct 19': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: 1.0,
6: nan,
7: nan,
8: 1.0,
9: nan},
'Nov 19': {0: 1.0,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: 1.0,
8: nan,
9: nan},
'Dec 19': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: 1.0,
7: nan,
8: nan,
9: nan},
'Jan 20': {0: nan,
1: nan,
2: nan,
3: 1.0,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Feb 20': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Mar 20': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Apr 20': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'May 20': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Jun 20': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Jul 20': {0: nan,
1: nan,
2: nan,
3: nan,
4: nan,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan},
'Aug 20': {0: nan,
1: nan,
2: nan,
3: nan,
4: 1.0,
5: nan,
6: nan,
7: nan,
8: nan,
9: nan}}
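The idea described above (find each customer's first non-NaN month and count the months from there to the last column) could be sketched roughly as follows. This is only an illustration, not a confirmed answer: the helper name add_tenure and the three-month toy frame are made up for the example, and it assumes the month columns are contiguous and already in chronological order, as they are in the sample df.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the pivot table: one row per customer,
# one column per month, NaN where the customer had no transactions.
toy = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "Jun 20": [np.nan, 1.0, np.nan],
    "Jul 20": [1.0, np.nan, np.nan],
    "Aug 20": [2.0, np.nan, np.nan],
})


def add_tenure(frame, id_col="customer_id"):
    """Return a copy with a 'tenure' column (months since first purchase).

    Assumes every column except id_col is a month, that the months are
    contiguous, and that they are ordered oldest to newest.
    """
    months = frame.drop(columns=id_col)
    has_purchase = months.notna()

    # argmax on a boolean array gives the position of the first True per row
    # (and 0 for rows that are all False, which is masked out below).
    first_pos = has_purchase.to_numpy().argmax(axis=1)

    # Distance, in columns, from the first purchase to the last month.
    tenure = pd.Series((months.shape[1] - 1) - first_pos, index=frame.index)

    out = frame.copy()
    # Customers with no purchases at all get NaN instead of a bogus value.
    out["tenure"] = tenure.where(has_purchase.any(axis=1))
    return out


result = add_tenure(toy)
print(result[["customer_id", "tenure"]])
```

Applied to the sample df, the same logic gives customer_id 1 a tenure of 12, since Aug 19 is 12 columns before Aug 20. The tenure column is float here only because NaN marks customers without any purchase.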
【Discussion】: