Convert a list of uneven dictionaries to a pandas DataFrame: answers

【Question title】: Convert list of uneven dictionaries to pandas DataFrame
【Posted】: 2020-01-04 17:56:58
【Question】:

Given this list of dictionaries:

[{'Empire:FourKingdoms:': {'US': '208', 'FR': '96', 'DE': '42', 'GB': '149'}}, 
 {'BigFarmMobileHarvest:': {'US': '211', 'FR': '101', 'DE': '64', 'GB': '261'}}, 
 {'AgeofLords:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}}, 
 {'BattlePiratesHQ:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}},
 {'CallofWar:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}}, 
 {'Empire:AgeofKnights:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}}, 
 {'Empire:MillenniumWars:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}}, 
 {'eRepublik:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}}, 
 {'GameofEmperors:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}}, 
 {'GameofTrenches:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}}]

and this list of row names:

['Name', 'country', '30/08/2019']

how can I get this DataFrame:

        Name:    Empire:FourKingdoms  BigFarmMobileHarvest  AgeofLords     ...
0    Country:    US  FR  DE  GB       US  FR  DE  GB        US JP FR DE GB
1 30/08/2019:    208 96  42  149      211 101 64  261       00 00 00 00 00 ...

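Each Country value and each 30/08/2019 value should have its own cell in the DataFrame, placed under the game it belongs to. I am not sure whether this is possible when the dictionaries have different lengths.

My first idea was to take the dictionaries out of the list, convert each one to a DataFrame in the desired shape (somehow), and then add the row names. I suspect a transpose has to fit in somewhere.

Another idea was to make the dictionary keys the column names and go from there.

Eventually, this has to be written to an Excel sheet.

I looked at previous questions, but I am not sure whether they apply to my case.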

【Comments】:

Tags: python-3.x pandas list dataframe

【Solution 1】:

You can do it like this:

import pandas as pd

# l is the list of dictionaries from the question

# Flatten each dictionary so that entries like
# 'Empire:FourKingdoms:' become values of the key 'Name'
l2 = []
for d in l:
    for name, dct in d.items():
        dct = dict(dct)  # copy, so the original data is not modified
        dct['Name'] = name
        l2.append(dct)

# Create a DataFrame from these flat dictionaries
df = pd.DataFrame(l2)
# There is a date in your example, so I guess you want to
# add rows from time to time
df['Date'] = '30/08/2019'

# Build an index from Date and Name (the columns the data
# is aligned to), then unstack to make Name the second
# level of the column index, swap the two levels so Name
# is on top, and finally sort the column index so the countries
# are grouped below each Name (instead of the Names being
# repeated under each country)
df.set_index(['Date', 'Name']).unstack(1).swaplevel(axis='columns').sort_index(axis=1)

The result looks like this:

Out[1]: 
Name       AgeofLords:                 BattlePiratesHQ:          ... GameofTrenches:         eRepublik:                
                    DE  FR  GB  JP  US               DE  FR  GB  ...              GB  JP  US         DE  FR  GB  JP  US
Date                                                             ...                                                   
30/08/2019          00  00  00  00  00               00  00  00  ...              00  00  00         00  00  00  00  00
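
One thing to be aware of: as far as I can tell, `unstack` builds every combination of Name and country, so a game without a `JP` entry still gets a `JP` column filled with NaN. If that matters, a possible alternative is to build the two-level columns directly from the nested dictionaries. This is only a minimal sketch on two entries from the question; the variable names `games`, `flat` and `wide` are made up for illustration, and stripping the trailing `:` from the game names is an assumption based on the desired output shown in the question.

```python
import pandas as pd

# Two entries from the question's list; the first one has no 'JP' key
games = [
    {'Empire:FourKingdoms:': {'US': '208', 'FR': '96', 'DE': '42', 'GB': '149'}},
    {'AgeofLords:': {'US': '00', 'JP': '00', 'FR': '00', 'DE': '00', 'GB': '00'}},
]
date = '30/08/2019'

# Map (game, country) -> value, keeping the original insertion order.
# Stripping the trailing ':' is an assumption, not part of the answer above.
flat = {
    (name.rstrip(':'), country): value
    for entry in games
    for name, values in entry.items()
    for country, value in values.items()
}

# Only the (game, country) pairs that actually occur become columns
columns = pd.MultiIndex.from_tuples(list(flat), names=['Name', 'Country'])
wide = pd.DataFrame(
    [list(flat.values())],
    index=pd.Index([date], name='Date'),
    columns=columns,
)
print(wide)
```

This keeps the countries in the order they appear in each dictionary instead of sorting them. Calling `.dropna(axis=1, how='all')` on the unstacked result should be another way to get rid of the empty columns.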

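【Discussion】:

Wow, it really works. I didn't think this was possible.
That was a very tricky question. I had a similar idea too.
pandas can do everything. Almost :-)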
@Vishnudev I have more where those came from :) pandas is really great, and sometimes complicated too.
Since you have already answered this one, @jottbe, please take a look at my next question, about writing the result to an existing Excel sheet: stackoverflow.com/questions/57745818/…