【Question title】: Pandas split ages by group
【Posted】: 2020-10-27 08:10:06
【Question】:

I'm new to pandas and need some help. I have an age column and need to split it into these groups: young: age ≤ 30; middle-aged: 30

def get_num_people_by_age_category(dataframe):
    young, middle_aged, old = (0, 0, 0)
    dataframe["age"] = pd.cut(x=dataframe['age'], bins=[30,31,60,61], labels=["young","middle_aged","old"])
    return young, middle_aged, old
ages = get_num_people_by_age_category(dataframe) 
print(dataframe)

【Comments】:

  • Do you want to change return young, middle_aged, old to return dataframe?
  • As @jezrael said, you never do anything with young, middle_aged, old = (0, 0, 0).
  • Hi @jezrael, I tried that but got the same error message... By the way, if I replace dataframe with my income data (the DataFrame I actually use), I get a different error message: TypeError: not supported between instances of 'int' and 'str'
  • @babavyna - can you change dataframe["age"] = pd.cut(x=dataframe['age'], bins=[30,31,60,61], labels=["young","middle_aged","old"]) to dataframe["age"] = pd.cut(x=pd.to_numeric(dataframe['age'], errors='coerce'), bins=[30,31,60,61], labels=["young","middle_aged","old"])?
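
The pd.to_numeric suggestion in the last comment can be sketched roughly as follows, assuming the age column was read in as strings. The sample values and the name raw are made up for illustration. Note that this sketch uses the bin edges 0, 30, 60, 100 instead of the question's [30,31,60,61]: those edges define the intervals (30, 31], (31, 60] and (60, 61], so any age outside 31 to 61 would still come back as NaN after the conversion.

```python
import pandas as pd

# Hypothetical sample: ages stored as strings, including one unparseable entry
raw = pd.DataFrame({"age": ["25", "30", "45", "sixty", "72"]})

# errors="coerce" turns entries that cannot be parsed into NaN instead of raising
ages = pd.to_numeric(raw["age"], errors="coerce")

# Bin the numeric ages; intervals are (0, 30], (30, 60], (60, 100]
raw["age_group"] = pd.cut(
    x=ages,
    bins=[0, 30, 60, 100],
    labels=["young", "middle_aged", "old"],
)
print(raw)
```

Rows whose age could not be converted end up with a missing age_group, so it may be worth checking how many NaN values appear before relying on the result.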

Tags: pandas split grouping


【Solution 1】:

The code below uses pd.cut() to assign each age to an age group.

# Import libraries
import pandas as pd

# Create DataFrame
df = pd.DataFrame({
    'age': [1, 20, 30, 31, 50, 60, 61, 80, 90]  # or random ages: np.random.randint(1, 100, 50), needs numpy
})

# Function: Copy-pasted from question and modified
def get_num_people_by_age_category(df):
    df["age_group"] = pd.cut(x=df['age'], bins=[0,30,60,100], labels=["young","middle_aged","old"])
    return df

# Call function
df = get_num_people_by_age_category(df)

Output

print(df)

   age    age_group
0    1        young
1   20        young
2   30        young
3   31  middle_aged
4   50  middle_aged
5   60  middle_aged
6   61          old
7   80          old
8   90          old
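
The question's function name suggests it was meant to return the number of people in each category. One possible way to get those counts, as a minimal sketch: it assumes the same bin edges as the answer, and the helper name count_people_by_age_category is hypothetical, not part of the original answer.

```python
import pandas as pd

def count_people_by_age_category(df):
    # Hypothetical helper: bins the ages without modifying the input DataFrame
    groups = pd.cut(
        x=df["age"],
        bins=[0, 30, 60, 100],
        labels=["young", "middle_aged", "old"],
    )
    # value_counts on a categorical Series reports every category, even empty ones
    counts = groups.value_counts(sort=False)
    return int(counts["young"]), int(counts["middle_aged"]), int(counts["old"])

df = pd.DataFrame({"age": [1, 20, 30, 31, 50, 60, 61, 80, 90]})
young, middle_aged, old = count_people_by_age_category(df)
print(young, middle_aged, old)  # 3 3 3
```

Ages that fall outside the bins (0 or below, or above 100) become NaN and are not counted in any category.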
