【Question title】: Split a list of dictionaries into multiple columns
【Posted】: 2020-04-23 09:31:16
【Question description】:

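I have a DataFrame with 30,000 rows and 5 columns. One of these columns holds lists of dictionaries, plus a few NaN values. I want to split this column into 3 fields (Legroom through In-flight Entertainment) and extract the rating for each.

Here is a sample for reference: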

import pandas as pd

d = {'col1': [[{'rating': 5, 'ratingLabel': 'Legroom'}, {'rating': 5, 'ratingLabel': 'Seat comfort'}, {'rating': 5, 'ratingLabel': 'In-flight Entertainment'}],'Nan']}
df = pd.DataFrame(data=d)
df

【Comments】:

  • What about df['col1'].explode().apply(pd.Series)['rating']?
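
On its own, the one-liner in the comment returns a single column of ratings rather than one column per label. Below is a minimal sketch of how the same explode() idea could be taken further. It assumes every list element is a dict with the keys rating and ratingLabel, and that each label appears at most once per row. The names records, long_df and wide are illustrative only.

```python
import pandas as pd

d = {'col1': [[{'rating': 5, 'ratingLabel': 'Legroom'},
               {'rating': 5, 'ratingLabel': 'Seat comfort'},
               {'rating': 5, 'ratingLabel': 'In-flight Entertainment'}],
              'Nan']}
df = pd.DataFrame(data=d)

# One row per list element; non-list cells such as 'Nan' are left as they are
exploded = df['col1'].explode()

# Keep only the dict elements and turn them into rating / ratingLabel columns
is_dict = [isinstance(v, dict) for v in exploded]
records = exploded[is_dict]
long_df = pd.DataFrame(records.tolist(), index=records.index)

# Move the labels into columns, then restore the rows that had no ratings
wide = (long_df.set_index('ratingLabel', append=True)['rating']
               .unstack('ratingLabel')
               .reindex(df.index))
print(wide)
```

The final reindex(df.index) is there so that rows without any ratings come back as all-NaN rows instead of disappearing.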

Tags: python-3.x pandas list dictionary


【Solution 1】:

Here is a possible solution that uses Series.apply() with pd.Series, following the strategy from "Splitting dictionary/list inside a Pandas Column into Separate Columns":

import pandas as pd

d = {'col1': [[{'rating': 5, 'ratingLabel': 'Legroom'},
               {'rating': 5, 'ratingLabel': 'Seat comfort'},
               {'rating': 5, 'ratingLabel': 'In-flight Entertainment'}],
              [{'rating': 5, 'ratingLabel': 'Legroom'},
               {'rating': 5, 'ratingLabel': 'Seat comfort'},
               {'rating': 5, 'ratingLabel': 'In-flight Entertainment'}],
              'Nan']}
df = pd.DataFrame(data=d)
df

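# Expand each list into positional columns 0, 1, 2 (one dict per column)
df_split = df['col1'].apply(pd.Series)
# Expand each dict column, rename its rating column, then drop col1, the label
# columns '1', '2', '3' and the column 0 that non-dict cells produce
pd.concat([df,
           df_split[0].apply(pd.Series).rename(columns = {'rating':'legroom_rating',
                                                          'ratingLabel':'1'}),
           df_split[1].apply(pd.Series).rename(columns = {'rating':'seat_comfort_rating',
                                                          'ratingLabel':'2'}),
           df_split[2].apply(pd.Series).rename(columns = {'rating':'in_flight_entertainment_rating',
                                                          'ratingLabel':'3'})],
           axis = 1).drop(['col1','1','2','3',0],
                         axis = 1)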

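This produces a DataFrame with the columns legroom_rating, seat_comfort_rating and in_flight_entertainment_rating.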
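
To see why the column 0 is dropped along with '1', '2' and '3', it may help to look at the intermediate column names. This is a small sketch on a two-row version of the sample data. The names first and second are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'col1': [[{'rating': 5, 'ratingLabel': 'Legroom'},
                             {'rating': 5, 'ratingLabel': 'Seat comfort'},
                             {'rating': 5, 'ratingLabel': 'In-flight Entertainment'}],
                            'Nan']})

# Each list becomes positional columns 0, 1, 2; the string 'Nan' lands in column 0
df_split = df['col1'].apply(pd.Series)
print(df_split.columns.tolist())

# Dicts expand to 'rating' and 'ratingLabel'; non-dict cells expand to a column named 0
first = df_split[0].apply(pd.Series)
second = df_split[1].apply(pd.Series)
print(first.columns.tolist())
print(second.columns.tolist())
```

Because every expanded piece can carry its own column 0, the single drop of 0 at the end removes all of them at once.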

【Comments】:

    【Solution 2】:

    IIUC, this should do the trick:

    # Check the type before iterating, so real float NaN cells do not raise a TypeError
    df = df["col1"].apply(lambda x: pd.Series({el["ratingLabel"]: el["rating"] for el in x} if isinstance(x, list) else {}))
    

    Output:

       Legroom  Seat comfort  In-flight Entertainment
    0      5.0           5.0                      5.0
    1      NaN           NaN                      NaN
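
Real float NaN values are not iterable, so the type check has to happen before iterating over a cell. Below is a minimal sketch of one way to guard against that and to keep the other columns next to the extracted ratings. The helper ratings_to_dict and the id column are hypothetical and not part of the original question.

```python
import numpy as np
import pandas as pd


def ratings_to_dict(cell):
    """Hypothetical helper: map a list of rating dicts to {label: rating}."""
    if isinstance(cell, list):
        return {el['ratingLabel']: el['rating'] for el in cell}
    # Anything else (float NaN, the string 'Nan', None) yields no ratings
    return {}


# 'id' stands in for the other columns of the real DataFrame (an assumption)
df = pd.DataFrame({
    'id': [1, 2, 3],
    'col1': [[{'rating': 5, 'ratingLabel': 'Legroom'},
              {'rating': 5, 'ratingLabel': 'Seat comfort'},
              {'rating': 5, 'ratingLabel': 'In-flight Entertainment'}],
             np.nan,
             'Nan'],
})

# Build the rating columns from plain dicts, keeping the original row index
ratings = pd.DataFrame(df['col1'].map(ratings_to_dict).tolist(), index=df.index)

# Replace the nested column with the flat rating columns
result = df.drop(columns='col1').join(ratings)
print(result)
```

join aligns on the index, so rows without ratings keep NaN in the new columns.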
    

    【Comments】:
