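[Posted]: 2020-06-22 06:20:34
[Question]:
I am trying to write code that uses the Python pandas library to categorize a dataset (loaded from a CSV file) by ranges of values. Aggregate functions may be used, but I am having trouble applying them.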
+-------------+-------------+-------------+-------------+-------------+
|Name         |Age          |Region       |Telephone    |Address      |
+-------------+-------------+-------------+-------------+-------------+
|             |             |             |             |             |
So far I have managed to write the following code.
import pandas as pd
data_frame = pd.read_csv('5000 Records.csv')
data_frame['age_range'] = pd.cut(data_frame['Age in Yrs.'],
                                 bins=[-float('inf'), 30, 50, float('inf')],
                                 labels=['above', 'in between', 'below'])
data_frame = data_frame.groupby(['Region', 'age_range']).agg(
    {
        'age_range': "count"
    }
)
print(data_frame)
But the result looks like this:
                      age_range
Region    age_range
Midwest   above             312
          in between        695
          below             390
Northeast above             201
          in between        421
          below             219
South     above             435
          in between        983
          below             452
West      above             211
          in between        443
          below             238
However, the required output is one row per region with one column per age range:
+-------------+-------------+-------------+-------------+
|Region       |above        |in between   |below        |
+-------------+-------------+-------------+-------------+
|             |             |             |             |
Can someone help me with this? Thanks in advance!
[Comments]:
-
Hi UpaniK, could you show a sample of your data before grouping?
-
The age column is filled with float values between 18 and 60.
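-
One possible way to get the wide layout asked for above is to count the rows per (Region, age_range) pair and then move the age_range level into the columns with unstack. This is only a minimal sketch, not a tested answer against the real file: the small DataFrame below is made-up sample data standing in for '5000 Records.csv', and the names df, counts and wide are illustrative. Note also that pd.cut assigns labels to bins in order, so with the labels exactly as written in the question, ages up to 30 are labelled 'above' and ages over 50 are labelled 'below'.

```python
import pandas as pd

# Hypothetical sample data standing in for '5000 Records.csv'.
df = pd.DataFrame({
    "Region": ["Midwest", "Midwest", "South", "West", "South", "Midwest"],
    "Age in Yrs.": [25.0, 41.5, 58.2, 33.0, 29.9, 55.0],
})

# Same binning as in the question: (-inf, 30], (30, 50], (50, inf).
df["age_range"] = pd.cut(
    df["Age in Yrs."],
    bins=[-float("inf"), 30, 50, float("inf")],
    labels=["above", "in between", "below"],
)

# Count rows per (Region, age_range); observed=False keeps empty categories as 0.
counts = df.groupby(["Region", "age_range"], observed=False).size()

# Pivot the age_range level into columns, one row per Region.
wide = counts.unstack(fill_value=0)

# Use plain string column names so that Region can become a regular column.
wide.columns = [str(c) for c in wide.columns]
wide = wide.reset_index()

print(wide)
```

With the real data, the sample DataFrame would be replaced by the pd.read_csv call from the question. pd.crosstab(df['Region'], df['age_range']) is another commonly used route to the same kind of table.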
Tags: python pandas aggregation