【Question title】: Creating a Python nested dictionary from DataFrame columns and saving the result in a new DataFrame
【Posted】: 2019-05-01 01:52:21
【Question description】:

I am struggling to take 3 columns from a DataFrame, build a dictionary from them, and save it in a new DataFrame.

This is the original DataFrame:

part_id  name   exp_no  key         value
1        Clips  58868   name        Charlie
1        Clips  58870   phone       123456789
1        Clips  58845   region      Ontario
2        Clips  58821   city        London
2        Clips  58832   country     Chili
3        Nails  58869   postalcode  123456
3        Nails  58830   colour      red

I am using pandas but have not had much success, so I would really appreciate some help.

Creating a new DataFrame and keeping only the unique data in it:

new_file = pd.DataFrame()
new_file = data_unique
for part_id in data.iterrows():
  if part_id in new_file:

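TypeError: 'Series' objects are mutable, thus they cannot be hashed

This error makes me think a DataFrame is not the right choice for this kind of program. What approach would be more suitable?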
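
A note on where this error most likely comes from: `iterrows()` yields `(index, row)` pairs in which `row` is a Series, and `in` applied to a DataFrame looks the item up among the column labels, which requires hashing it. The sketch below reproduces that under stated assumptions; the small frame `data` is a stand-in, since the asker's `data` and `data_unique` are not shown in the question.

```python
import pandas as pd

# Stand-in for the asker's frame (assumption); two columns are enough here.
data = pd.DataFrame({"part_id": [1, 1, 2], "key": ["name", "phone", "city"]})

# iterrows() yields (index, row) pairs, where row is a Series.
first_item = next(data.iterrows())
item_is_pair = isinstance(first_item, tuple) and isinstance(first_item[1], pd.Series)

# `x in frame` tests x against the column labels, so x has to be hashable.
try:
    first_item in data
    raised_type_error = False
except TypeError:
    raised_type_error = True

# To test whether a value occurs in a column, check against the values themselves.
seen_part_ids = set(data["part_id"])
part_two_present = 2 in seen_part_ids
```

So the error points at the membership test in the loop rather than at the DataFrame being the wrong container.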

This is what the final result should look like, with one record per part ID:

part_id name    exp_no  key value   exp_key_value
1       clips   58868   name    Charlie {"attributes": 
[{"exp_no":"58868", "key":"name", "value":"Charlie"}, 
{"exp_no":"58870", "key":"phone", "value":"123456789"}, 
{"exp_no":"58845", "key":"region", "value":"Ontario"} ] } 
2       clips   58821   city    London  {"attributes": 
[{"exp_no":"58821", "key":"city", "value":"London"}, 
{"exp_no":"58832", "key":"country", "value":"Chili"} ] }
3   nails   58869   postal  12345   {"attributes": 
[{"exp_no":"58869", "key":"postal", "value":"12345"}, 
{"exp_no":"58830", "key":"colour", "value":"red"} ] }

【Question comments】:

Use drop_duplicates and pass the subset of columns on which duplicates should be dropped.
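
As a rough illustration of that suggestion, the sketch below keeps the first row for each `part_id`. The frame is rebuilt from the table in the question, and the variable names `data` and `data_unique` follow the asker's snippet; how the asker actually built them is an assumption.

```python
import pandas as pd

# Rows copied from the table in the question (value columns omitted for brevity).
data = pd.DataFrame({
    "part_id": [1, 1, 1, 2, 2, 3, 3],
    "name": ["Clips", "Clips", "Clips", "Clips", "Clips", "Nails", "Nails"],
    "exp_no": [58868, 58870, 58845, 58821, 58832, 58869, 58830],
})

# Keep the first row seen for each part_id; the other columns come from that row.
data_unique = data.drop_duplicates(subset="part_id", keep="first")

print(data_unique)
```

Note that this only picks one row per part; the remaining rows still have to be collected separately to build the nested attributes.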

Tags: python pandas

【Solution 1】:

Try this:

   import pandas as pd

   df = pd.DataFrame({"part_id":[1,1,1,2,2,3,3],
               "name":['Clips', 'Clips' , 'Clips' , 'Clips', 'Clips', 'Nails', 'Nails'], 
               "exp_no": [58868, 58869, 58860, 58861, 588682, 58863, 58864], 
               "key":['name', 'phone', 'region','city', 'country', 'postalcode', 'colour'], 
              "value": ['Charlie', '123456789', 'Ontario', 'London','Chili', '123456', 'red']})

   # create the dictionary for each row, reading each field from its own column
   def create_dic(row):
     return {'exp_no': row['exp_no'], 'key': row['key'], 'value': row['value']}

   df['exp_key_value'] = df.apply(create_dic, axis=1)

   # keep the first row per part_id; .copy() so the assignment below targets a real frame
   df_dropped = df.drop_duplicates(subset='part_id', keep='first').copy()

   final = []

   for part in df['part_id'].unique():
     # build a new dictionary per part; reusing one would make every entry identical
     attributes = df[df['part_id'] == part]['exp_key_value'].tolist()
     final.append({'attributes': attributes})

   # df_dropped and unique() both follow first-appearance order, so the rows line up
   df_dropped['exp_key_value'] = final
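
An alternative that may be easier to follow is to loop over `groupby("part_id")` and build one record per group. This is only a sketch, not the answerer's method: it uses the rows from the question's table, converts `exp_no` to strings to match the desired output, and the name `result` is an assumption.

```python
import json
import pandas as pd

# Rows copied from the table in the question.
df = pd.DataFrame({
    "part_id": [1, 1, 1, 2, 2, 3, 3],
    "name": ["Clips", "Clips", "Clips", "Clips", "Clips", "Nails", "Nails"],
    "exp_no": [58868, 58870, 58845, 58821, 58832, 58869, 58830],
    "key": ["name", "phone", "region", "city", "country", "postalcode", "colour"],
    "value": ["Charlie", "123456789", "Ontario", "London", "Chili", "123456", "red"],
})

records = []
for part_id, group in df.groupby("part_id", sort=False):
    # One {"exp_no", "key", "value"} dict per row of this part, all as strings.
    attributes = group[["exp_no", "key", "value"]].astype(str).to_dict("records")
    # Start from the first row of the group and attach the nested dictionary.
    record = group.iloc[0].to_dict()
    record["exp_key_value"] = {"attributes": attributes}
    records.append(record)

# One row per part_id, with the nested dictionary stored in a single cell.
result = pd.DataFrame(records)

# The cell can be serialised to JSON text if a string is needed instead of a dict.
print(json.dumps(result.loc[0, "exp_key_value"]))
```

Building the records as plain Python objects first avoids assigning into a slice of the original frame.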

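【Comments】:

Thank you, Keshav. I just got 'TypeError: 'type' object does not support item assignment' at dict['attribute'] =df[df['part_id']==part]['exp_key_value'].tolist()
Do you get this error when running exactly the same code as above?
Yes, running it in a Jupyter Notebook on Python 3.
Sorry, I missed the line of code that initializes the dictionary. I have edited the answer above!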
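
For anyone hitting the same message: when no variable named `dict` has been assigned, the name refers to Python's built-in type, which does not accept item assignment. The sketch below shows that, along with a related pitfall worth checking for when one dictionary is reused across loop iterations. The names `shared`, `aliased`, and `distinct` are made up for the illustration.

```python
# With no prior `dict = {}`, the name `dict` is the built-in type,
# and a type does not accept item assignment.
try:
    dict["attribute"] = [1, 2, 3]
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

# Appending one shared dictionary repeatedly stores the same object each time,
# so every list entry ends up showing the last value written.
shared = {}
aliased = []
for part in (1, 2, 3):
    shared["attribute"] = part
    aliased.append(shared)

# Building a new dictionary per iteration keeps the entries distinct.
distinct = [{"attribute": part} for part in (1, 2, 3)]
```

Using a different variable name than `dict` also avoids shadowing the built-in in the rest of the notebook.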