【Question title】: Python DataFrame merge for matching
【Posted】: 2021-09-25 18:45:04
【Question】:

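I want to match pairs from two DataFrames so that no two pairs are the same and no individual value is repeated in any other pair. The matching keys are 'cntr_size' and 'carrier'.

For example: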

import pandas as pd

import_v = pd.DataFrame({
    'cntr_no':[1,2,3,4,5,6,7,8],
    'cntr_size':[40,40,20,40,40,20,20,20],
    'carrier': ['MSK', 'MSK', 'MSC','MSK', 'MSK', 'MSC','CMA', 'MSK']
})

export_v = pd.DataFrame({
    'cntr_no':[9,10,11,12,13,14,15,16,17,18,19,20],
    'cntr_size':[40,40,20,40,40,20,20,20,40,40,20,20],
    'carrier': ['MSK', 'MSK', 'MSC','MSK', 'MSK', 'MSC','MSK', 'MSK','MSK', 'HLL','MSK', 'MSK']
})

If I merge with:

potential = pd.merge(import_v, export_v, on=['cntr_size','carrier'])

the output I get is:

cntr_no_x   cntr_size   carrier cntr_no_y
0   1   40  MSK 9
1   1   40  MSK 10
2   1   40  MSK 12
3   1   40  MSK 13
4   1   40  MSK 17
5   2   40  MSK 9
6   2   40  MSK 10
7   2   40  MSK 12
8   2   40  MSK 13
9   2   40  MSK 17
10  4   40  MSK 9
11  4   40  MSK 10
12  4   40  MSK 12
13  4   40  MSK 13
14  4   40  MSK 17
... and so on
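
The repeated rows come from the merge being many-to-many: within each (cntr_size, carrier) group, every import row is paired with every export row. A minimal sketch of that arithmetic on the data above follows; the names `keys`, `n_import`, `n_export` and `expected_rows` are only illustrative.

```python
import pandas as pd

import_v = pd.DataFrame({
    'cntr_no': [1, 2, 3, 4, 5, 6, 7, 8],
    'cntr_size': [40, 40, 20, 40, 40, 20, 20, 20],
    'carrier': ['MSK', 'MSK', 'MSC', 'MSK', 'MSK', 'MSC', 'CMA', 'MSK'],
})
export_v = pd.DataFrame({
    'cntr_no': [9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    'cntr_size': [40, 40, 20, 40, 40, 20, 20, 20, 40, 40, 20, 20],
    'carrier': ['MSK', 'MSK', 'MSC', 'MSK', 'MSK', 'MSC',
                'MSK', 'MSK', 'MSK', 'HLL', 'MSK', 'MSK'],
})

keys = ['cntr_size', 'carrier']

# Rows per key group on each side.
n_import = import_v.groupby(keys).size()
n_export = export_v.groupby(keys).size()

# A group contributes (import rows) * (export rows) rows to the merge;
# groups missing on one side count as 0 and contribute nothing.
expected_rows = int(n_import.mul(n_export, fill_value=0).sum())

potential = pd.merge(import_v, export_v, on=keys)
print(expected_rows, len(potential))
```

For example, the (40, MSK) group has 4 import rows and 5 export rows, so it alone accounts for 20 rows of the merged frame.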

The output I want:

cntr_no_x   cntr_size   carrier cntr_no_y
0   1   40  MSK 9
1   2   40  MSK 10
2   4   40  MSK 12

So both cntr_no_x and cntr_no_y should be unique; neither may repeat.
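
The requirement can be written down as a small check, which is handy for testing any candidate result. This is only a sketch: `is_one_to_one` is a hypothetical helper name, and the two small frames are cut-down versions of the outputs shown above.

```python
import pandas as pd

def is_one_to_one(pairs, left='cntr_no_x', right='cntr_no_y'):
    """Return True if no container number repeats on either side."""
    return bool(pairs[left].is_unique and pairs[right].is_unique)

# First rows of the plain merge output: container 1 appears twice.
got = pd.DataFrame({'cntr_no_x': [1, 1, 2], 'cntr_no_y': [9, 10, 9]})

# The desired output: every container appears at most once.
wanted = pd.DataFrame({'cntr_no_x': [1, 2, 4], 'cntr_no_y': [9, 10, 12]})

print(is_one_to_one(got), is_one_to_one(wanted))
```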

【Comments】:

    Tags: python pandas merge matching


    【Solution 1】:

    The solution I arrived at:

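    # match_scale is the merged DataFrame (`potential` in the question); here the
    # container-number columns are named container_no_x / container_no_y.
    cntr_list_i = set()  # import containers already paired
    cntr_list_e = set()  # export containers already paired
    match_scale['status'] = False
    for ind in match_scale.index:
        cntr_i = match_scale.loc[ind, 'container_no_x']
        cntr_e = match_scale.loc[ind, 'container_no_y']
        # Keep the row only if neither container has been used yet.
        if cntr_i not in cntr_list_i and cntr_e not in cntr_list_e:
            match_scale.loc[ind, 'status'] = True
            cntr_list_i.add(cntr_i)
            cntr_list_e.add(cntr_e)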
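
As a possible alternative to filtering the merged frame row by row, the pairing can be done before the merge by numbering the rows inside each (cntr_size, carrier) group and merging on that number as well. This is a sketch on the question's data, not the answer author's method; the helper column `_k` and the names `imp`, `exp` and `matched` are made up for illustration.

```python
import pandas as pd

import_v = pd.DataFrame({
    'cntr_no': [1, 2, 3, 4, 5, 6, 7, 8],
    'cntr_size': [40, 40, 20, 40, 40, 20, 20, 20],
    'carrier': ['MSK', 'MSK', 'MSC', 'MSK', 'MSK', 'MSC', 'CMA', 'MSK'],
})
export_v = pd.DataFrame({
    'cntr_no': [9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    'cntr_size': [40, 40, 20, 40, 40, 20, 20, 20, 40, 40, 20, 20],
    'carrier': ['MSK', 'MSK', 'MSC', 'MSK', 'MSK', 'MSC',
                'MSK', 'MSK', 'MSK', 'HLL', 'MSK', 'MSK'],
})

keys = ['cntr_size', 'carrier']

# Number each row within its key group: 0, 1, 2, ...
imp = import_v.assign(_k=import_v.groupby(keys).cumcount())
exp = export_v.assign(_k=export_v.groupby(keys).cumcount())

# Merging on the keys plus the counter pairs the n-th import of a group
# with the n-th export of the same group, so no container is reused.
matched = pd.merge(imp, exp, on=keys + ['_k']).drop(columns='_k')
print(matched)
```

On this data the result contains each import container at most once and each export container at most once; containers with no partner left in their group (for example the CMA import) simply drop out of the inner merge.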
    
