【Question title】: Pandas Long to Wide for Categorical Dataframe
【Posted】: 2021-09-27 02:49:49
【Question】:

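Usually, when we want to reshape a DataFrame from long to wide in Pandas, we use pivot, pivot_table, unstack, or groupby, but these only work well when there are elements that can be aggregated. How can we reshape a categorical DataFrame in the same way?

Example: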

import numpy as np
import pandas as pd

d = {'Fruit': ['Apple', 'Apple', 'Apple', 'Kiwi'],
     'Color1': ['Red', 'Yellow', 'Red', 'Green'],
     'Color2': ['Red', 'Red', 'Green', 'Brown'],
     'Color3': [np.nan, np.nan, 'Red', np.nan]}

df = pd.DataFrame(d)
df

    Fruit   Color1  Color2  Color3
0   Apple   Red     Red     NaN
1   Apple   Yellow  Red     NaN
2   Apple   Red     Green   Red
3   Kiwi    Green   Brown   NaN

This should become:

d = {'Fruit':['Apple','Kiwi'], 
     'Color1':['Red','Green'],
     'Color1_1':['Yellow',np.nan],
     'Color1_2':['Red',np.nan],
     'Color2':['Red', 'Brown'],
     'Color2_1':['Red',np.nan],
     'Color2_2':['Green',np.nan],
     'Color3':[np.nan,np.nan],
     'Color3_1':[np.nan,np.nan],
     'Color3_2':['Red',np.nan]
    }

pd.DataFrame(d)

    Fruit   Color1  Color1_1    Color1_2    Color2  Color2_1    Color2_2    Color3  Color3_1    Color3_2
0   Apple   Red     Yellow      Red         Red     Red         Green       NaN     NaN         Red
1   Kiwi    Green   NaN         NaN         Brown   NaN         NaN         NaN     NaN         NaN

【Question comments】:

    Tags: python pandas


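    【Solution 1】:

    Use groupby with cumcount to number the rows within each Fruit, pivot with that counter as the columns, and then flatten the resulting column names: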

    df = df.assign(idx=df.groupby('Fruit').cumcount()).pivot(index='Fruit',columns='idx')
    print(df.set_axis([f'{x}_{y}' if y != 0 else x for x, y in df.columns], axis=1).reset_index())
    

    Output:

       Fruit Color1 Color1_1 Color1_2 Color2 Color2_1 Color2_2 Color3 Color3_1 Color3_2
    0  Apple    Red   Yellow      Red    Red      Red    Green    NaN      NaN      Red
    1   Kiwi  Green      NaN      NaN  Brown      NaN      NaN    NaN      NaN      NaN
    

    This matches your expected output exactly.
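
The one-liner above can also be read as three separate steps. The following is a minimal sketch of the same approach, not a different method; the helper name `widen_by_group` and the temporary column name `idx` are illustrative choices, and it assumes the frame has no existing column called `idx`.

```python
import numpy as np
import pandas as pd


def widen_by_group(frame, key):
    """Spread repeated rows of each `key` value across suffixed columns.

    Hypothetical helper for illustration; assumes `frame` has no column
    named 'idx'.
    """
    # Step 1: number the rows within each group (0, 1, 2, ...).
    idx = frame.groupby(key).cumcount()
    # Step 2: one row per key; columns become (original column, idx) pairs.
    wide = frame.assign(idx=idx).pivot(index=key, columns="idx")
    # Step 3: flatten the pairs; position 0 keeps the original name,
    # later positions get a numeric suffix.
    wide.columns = [col if i == 0 else f"{col}_{i}" for col, i in wide.columns]
    return wide.reset_index()


df = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Apple", "Kiwi"],
    "Color1": ["Red", "Yellow", "Red", "Green"],
    "Color2": ["Red", "Red", "Green", "Brown"],
    "Color3": [np.nan, np.nan, "Red", np.nan],
})

wide = widen_by_group(df, "Fruit")
print(wide)
```

Because the per-group counter makes every (Fruit, idx) pair unique, pivot has nothing to aggregate, which is why this works for purely categorical values. Groups with fewer rows than the largest group are padded with NaN.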

    【Comments】:
