【Posted】: 2021-03-31 04:10:53
【Question】:
I have a dataframe:
id-input  id-output  Date        Price  Type
1         3          20/09/2020  100    ABC
2         1          20/09/2020  200    ABC
2         1          21/09/2020  300    ABC
1         3          21/09/2020  50     AD
1         2          21/09/2020  40     AD
I would like to get this output:
id-inp-ABC  id-out-ABC  Date-ABC    Price-ABC  Type-ABC  id-inp-AD  id-out-AD  Date-AD     Price-AD  Type-AD
1           3           20/09/2020  10         ABC       2          1          20/09/2020  10        AD
1'          3           20/09/2020  90         ABC       Nan        Nan        Nan         Nan       Nan
2           1           20/09/2020  40         ABC       1          2          21/09/2020  40        AD
2'          1           20/09/2020  160        ABC       Nan        Nan        Nan         Nan       Nan
2           1           21/09/2020  300        ABC       Nan        Nan        Nan         Nan       Nan
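My idea is:
- Split the dataframe into two dataframes by type.
- Iterate over both dataframes and check whether id-input == id-output.
- Check whether the prices are equal; if they are not, split the row and extract the price.
- Rename the columns and merge the two dataframes.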
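The first and last steps of this plan (split by type, rename the columns) can be sketched without a manual loop. This is only a minimal sketch under assumptions: the sample data is copied from the table in the question, `split_by_type` is a hypothetical helper name, and the suffixed columns come out as `id-input-ABC` rather than the shortened `id-inp-ABC` shown in the desired output.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "id-input": [1, 2, 2, 1, 1],
    "id-output": [3, 1, 1, 3, 2],
    "Date": ["20/09/2020", "20/09/2020", "21/09/2020", "21/09/2020", "21/09/2020"],
    "Price": [100, 200, 300, 50, 40],
    "Type": ["ABC", "ABC", "ABC", "AD", "AD"],
})


def split_by_type(frame, type_col="Type"):
    """Hypothetical helper: map each type value to its sub-frame,
    with '-<type>' appended to every column name."""
    parts = {}
    # sort=False keeps the groups in order of first appearance.
    for type_value, group in frame.groupby(type_col, sort=False):
        parts[type_value] = group.reset_index(drop=True).add_suffix(f"-{type_value}")
    return parts


parts = split_by_type(df)
abc, ad = parts["ABC"], parts["AD"]
print(abc)
print(ad)
```

Keeping the pieces in a dict keyed by type avoids relying on list positions such as `[0]` and `[1]`, which silently depend on the group order.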
import pandas as pd

# Split the dataframe into one frame per type.
grp = df.groupby('type')
transformed_df_list = []
for idx, frame in grp:
    frame = frame.reset_index(drop=True)
    transformed_df_list.append(frame.copy())
ABC = pd.DataFrame(transformed_df_list[0])
AD = pd.DataFrame(transformed_df_list[1])

# For each ABC row, look for AD rows whose id-out equals its id-inp.
for i, row in ABC.iterrows():
    for j, row1 in AD.iterrows():
        if row['id-inp'] == row1['id-out']:
            # Wrap the matching AD row in a one-row dataframe and rename its columns.
            row_df = pd.DataFrame([row1])
            row_df = row_df.rename(columns={'id-inp': 'id-inp-AD', 'id-out': 'id-out-AD', 'Date': 'Date-AD', 'price': 'price-AD'})

# Left-merge the ABC frame with the renamed AD row.
output = pd.merge(ABC.set_index('id-inp', drop=False), row_df.set_index('id-out-AD', drop=False), how='left', left_on=['id-inp'], right_on=['id-inp-AD'])
But the result has NaN in the id-inp-AD id-out-AD Date-AD Price-AD Type-AD part,
and row_df contains only the last row:
1 2 21/09/2020 40 A
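In the loop above, `row_df` is reassigned on every match, so after the loop it can only hold the last matching row. A minimal sketch of one way around that, assuming the same matching condition as the question's code (`id-inp` of the ABC row equals `id-out` of the AD row) and small sample frames derived from the question's table, is to collect the matches in a list, or to let a left `merge` do the matching:

```python
import pandas as pd

# Small frames derived from the question's table (column names as used in its code).
abc = pd.DataFrame({"id-inp": [1, 2, 2], "id-out": [3, 1, 1],
                    "Date": ["20/09/2020", "20/09/2020", "21/09/2020"],
                    "Price": [100, 200, 300]})
ad = pd.DataFrame({"id-inp": [1, 1], "id-out": [3, 2],
                   "Date": ["21/09/2020", "21/09/2020"],
                   "Price": [50, 40]})

# Option 1: keep every matching AD row instead of overwriting a single variable.
matches = []
for _, row in abc.iterrows():
    for _, row1 in ad.iterrows():
        if row["id-inp"] == row1["id-out"]:
            matches.append(row1)
matched_ad = pd.DataFrame(matches).add_suffix("-AD").reset_index(drop=True)

# Option 2: a left merge performs the same matching without explicit loops;
# ABC rows with no AD partner get NaN in the AD columns.
merged = abc.merge(ad.add_suffix("-AD"), how="left",
                   left_on="id-inp", right_on="id-out-AD")
print(matched_ad)
print(merged)
```

Whether this matching rule is the intended one is for the asker to confirm; the sketch only shows how to keep all matches rather than the last one.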
I would also like the iteration to respect the row order, and every row inserted into the output dataframe to be sorted by date.
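For the ordering requirement, a minimal sketch: the dates are day-first strings, so sorting them as text would compare the day before the month and year. Parsing them first and using a stable sort keeps rows with the same date in their original relative order. The shuffled sample frame and the temporary `_date` column are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample: rows from the question in a shuffled order.
df = pd.DataFrame({
    "Date": ["21/09/2020", "20/09/2020", "21/09/2020", "20/09/2020"],
    "Price": [300, 100, 50, 200],
})

# Parse the day-first strings into real datetimes before sorting.
parsed = pd.to_datetime(df["Date"], format="%d/%m/%Y")

ordered = (
    df.assign(_date=parsed)
      .sort_values("_date", kind="stable")  # stable: ties keep their original order
      .drop(columns="_date")
      .reset_index(drop=True)
)
print(ordered)
```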
【Comments】: