【Posted】: 2018-05-05 15:33:16
【Question】:
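I am trying to compare df1 and df2 on "Cntr No". For a row of df1 to count as a match, its Total must equal a value in any one of the df2 columns ["Labour Cost", "Material Cost", "Amount in Estimate Currency"] for the same "Cntr No".
For example, df1 OOLU 3868088 matches df2 OOLU 3868088, and the df1 Total of 28 matches the df2 "Labour Cost" value of 28.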
Sample DataFrames:
import pandas as pd

df1 = pd.DataFrame({'Cntr No': ['OOLU 3868088','OOLU 3868088','OOLU 3868088','TRIU 0625840','TRIU 0625840','TRIU 0625840','TRIU 1234567','OOLU 6232016','OOLU 0981231','OOLU 1212444'],
                    'Total': [12,28,48,119,82.5,11.0,18.0,11.0,13.0,10.0]})
df2 = pd.DataFrame({'Cntr No': ['OOLU 3868088','OOLU 3868088','OOLU 3868088','TRIU 0625840','TRIU 0625840','TRIU 0625840','TRIU 1234567'],
                    'Labour Cost': [0.0,0.0,28.0,0.0,54.0,0.0,0.0],
                    'Material Cost': [0.00,12.0,58.91,82.5,54.0,0.0,16.0],
                    'Amount in Estimate Currency': [48.00,12.00,87.81,82.5,119.0,12.0,16.0]})
Expected output:
        Cntr No  Total Tally_with_df2
0  OOLU 3868088   12.0            Yes
1  OOLU 3868088   28.0            Yes
2  OOLU 3868088   48.0            Yes
3  TRIU 0625840  119.0            Yes
4  TRIU 0625840   82.5            Yes
5  TRIU 0625840   11.0             No
6  TRIU 1234567   18.0             No
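One possible way to produce this column, offered as a sketch rather than a definitive answer, is to reshape the three cost columns of df2 into (Cntr No, value) pairs and left-merge df1 against them. The names `pairs` and `merged` are illustrative and not part of the question; the data is the sample data from the question. The comparison is an exact float match.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Cntr No': ['OOLU 3868088', 'OOLU 3868088', 'OOLU 3868088',
                'TRIU 0625840', 'TRIU 0625840', 'TRIU 0625840',
                'TRIU 1234567', 'OOLU 6232016', 'OOLU 0981231', 'OOLU 1212444'],
    'Total': [12, 28, 48, 119, 82.5, 11.0, 18.0, 11.0, 13.0, 10.0],
})
df2 = pd.DataFrame({
    'Cntr No': ['OOLU 3868088', 'OOLU 3868088', 'OOLU 3868088',
                'TRIU 0625840', 'TRIU 0625840', 'TRIU 0625840', 'TRIU 1234567'],
    'Labour Cost': [0.0, 0.0, 28.0, 0.0, 54.0, 0.0, 0.0],
    'Material Cost': [0.00, 12.0, 58.91, 82.5, 54.0, 0.0, 16.0],
    'Amount in Estimate Currency': [48.00, 12.00, 87.81, 82.5, 119.0, 12.0, 16.0],
})

cols = ['Labour Cost', 'Material Cost', 'Amount in Estimate Currency']

# Long format: one row per (Cntr No, value) pair found in any cost column.
# Duplicates are dropped so the merge below cannot multiply df1 rows.
pairs = (
    df2.melt(id_vars='Cntr No', value_vars=cols, value_name='Total')
       [['Cntr No', 'Total']]
       .drop_duplicates()
)

# The left merge keeps every df1 row in order; `_merge` is 'both' when a
# matching (Cntr No, Total) pair exists in df2, 'left_only' otherwise.
merged = df1.merge(pairs, on=['Cntr No', 'Total'], how='left', indicator=True)
df1['Tally_with_df2'] = (
    (merged['_merge'] == 'both').map({True: 'Yes', False: 'No'}).to_numpy()
)

print(df1)
```

With the sample data, rows 0 to 4 come out as Yes and rows 5 and 6 as No, matching the expected output. The last three df1 rows (OOLU 6232016, OOLU 0981231, OOLU 1212444) have no counterpart in df2, so they are also marked No.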
Code used: this is what I tried, but it does not produce the result I need.
cols = ['Labour Cost', 'Material Cost', 'Amount in Estimate Currency']
d = {k: set(v.values()) for k, v in \
df_co.set_index('Cntr No')[cols].to_dict(orient='index').items()}
df['Tally'] = [j in d.get(i, set()) for i, j in zip(df['Cntr No'], df['Total'])]
df['Tally'] = df['Tally'].map({True: 'Yes', False: 'No'})
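A likely reason the attempt above falls short: `to_dict(orient='index')` needs a unique index, but 'Cntr No' repeats in df2. Depending on the pandas version, the call either raises a ValueError or keeps only one row per container, so most candidate values never reach the set. The sketch below keeps the same dictionary-of-sets idea but builds each set with `groupby`, so every row of a container contributes. It assumes that `df` and `df_co` in the attempt correspond to df1 and df2 from the question.

```python
import pandas as pd

# Assumption: `df` is df1 and `df_co` is df2 from the question.
df = pd.DataFrame({
    'Cntr No': ['OOLU 3868088', 'OOLU 3868088', 'OOLU 3868088',
                'TRIU 0625840', 'TRIU 0625840', 'TRIU 0625840',
                'TRIU 1234567', 'OOLU 6232016', 'OOLU 0981231', 'OOLU 1212444'],
    'Total': [12, 28, 48, 119, 82.5, 11.0, 18.0, 11.0, 13.0, 10.0],
})
df_co = pd.DataFrame({
    'Cntr No': ['OOLU 3868088', 'OOLU 3868088', 'OOLU 3868088',
                'TRIU 0625840', 'TRIU 0625840', 'TRIU 0625840', 'TRIU 1234567'],
    'Labour Cost': [0.0, 0.0, 28.0, 0.0, 54.0, 0.0, 0.0],
    'Material Cost': [0.00, 12.0, 58.91, 82.5, 54.0, 0.0, 16.0],
    'Amount in Estimate Currency': [48.00, 12.00, 87.81, 82.5, 119.0, 12.0, 16.0],
})

cols = ['Labour Cost', 'Material Cost', 'Amount in Estimate Currency']

# One set per container, holding the values from ALL of its rows and
# all three cost columns (not just a single row).
d = {
    cntr: set(group[cols].to_numpy().ravel().tolist())
    for cntr, group in df_co.groupby('Cntr No')
}

# Containers missing from df_co fall back to an empty set, i.e. 'No'.
df['Tally'] = [
    total in d.get(cntr, set())
    for cntr, total in zip(df['Cntr No'], df['Total'])
]
df['Tally'] = df['Tally'].map({True: 'Yes', False: 'No'})

print(df)
```

The membership test is exact, so it only finds a Total that is stored with precisely the same float value in df2.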
df1 dtypes:
Cntr No object
Serviced By object
Location object
WO No object
WASH - CHEMICAL float64
PTI - CHILL float64
WASHING CONTAINER AGENT float64
WASH - CHEMICAL AGENT float64
WASHING CONTAINER -AGENT float64
BUNDLING/UNBUNDLING OF FR float64
PTI - AUTO float64
PTI float64
Struct Repair - Labour float64
Struct Repair - Material float64
Machy Repair - Labour float64
Total float64
Vendor object
Sz object
Ty object
CO object
WO Date object
WO ID object
df2 dtypes:
Cntr No object
Equipment Size/type Group Code object
Labour Cost float64
Material Cost float64
Amount in Estimate Currency float64
Remarks object
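Since Total and the three cost columns are all float64, an exact equality test can miss values that differ only by floating-point rounding. If that matters for the real data, a tolerance-based comparison is one option. The sketch below is an assumption-laden illustration: the helper name `tally`, the `atol` default of 0.005, and the small input frames (made-up values chosen to show rounding noise) are not from the question.

```python
import numpy as np
import pandas as pd

def tally(left, right, cols, atol=0.005):
    """Return 'Yes'/'No' per row of `left`: is Total within `atol` of any
    value in `cols` of `right` for the same 'Cntr No'? (hypothetical helper)"""
    # Flatten all cost values of each container into one NumPy array.
    values = {
        cntr: group[cols].to_numpy(dtype=float).ravel()
        for cntr, group in right.groupby('Cntr No')
    }
    empty = np.array([], dtype=float)
    flags = [
        bool(np.isclose(values.get(cntr, empty), total, rtol=0.0, atol=atol).any())
        for cntr, total in zip(left['Cntr No'], left['Total'])
    ]
    return pd.Series(flags, index=left.index).map({True: 'Yes', False: 'No'})

cols = ['Labour Cost', 'Material Cost', 'Amount in Estimate Currency']

# Illustrative data only: 0.1 + 0.2 is not exactly 0.3 in floating point.
left = pd.DataFrame({
    'Cntr No': ['OOLU 3868088', 'TRIU 0625840', 'OOLU 6232016'],
    'Total': [0.1 + 0.2, 11.0, 11.0],
})
right = pd.DataFrame({
    'Cntr No': ['OOLU 3868088', 'TRIU 0625840'],
    'Labour Cost': [0.3, 0.0],
    'Material Cost': [0.0, 0.0],
    'Amount in Estimate Currency': [0.3, 12.0],
})

left['Tally_with_df2'] = tally(left, right, cols)
print(left)
```

The tolerance should be chosen to suit the currency precision of the real data; half a cent is only a placeholder here.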
【Discussion】: