【Question title】: Merging 3 different dataframes based on conditions
【Posted】: 2017-03-09 17:17:18
【Question】:

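How can I combine three dataframes like the ones shown below?

The first two must be joined on ID1, since that is the column that matches between them.

For the third dataframe, Address2 must match in order for Hash to be added.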

Df1:

Name1   Name2  Name3   Address    ID1     ID2    Own
Matt    John1  Jill     878 home   1       0     Deal
Matt    John2  Jack     879 home   2       1     Dael

DF2:

Name1   ID1   Address   Name4     Address2
Matt    1     878 home  face1     face\123
Matt    1     878 home  face2     face\345
Matt    1     878 home  face3     face\678    
Matt    2     879 home  head1     head\123
Matt    2     879 home  head2     head\345
Matt    2     879 home  head3     head\678

DF3:

Address2     Hash
face\123     abc123
face\345     cde321
face\678     efg123
head\123     123efg
head\345     efg321
head\678     acd321

I am trying to merge the three dataframes into one, like this:

Name1   Name2   ID1 Address     Own    Name3    ID2 Name4   Address2    Hash
Matt    John1   1   878 home    Deal    Jill    0   face1   face\123    abc123
Matt    John1   1   878 home    Deal    Jill    0   face2   face\345    cde321
Matt    John1   1   878 home    Deal    Jill    0   face3   face\678    efg123
Matt    John2   2   879 home    Dael    Jack    1   head1   head\123    123efg
Matt    John2   2   879 home    Dael    Jack    1   head2   head\345    efg321
Matt    John2   2   879 home    Dael    Jack    1   head3   head\678    acd321

The key between df1 and df2 is ID1; the key between df2 and df3 is Address2.

Any help is much appreciated.

【Question comments】:

  • Isn't this just a merge on the intersection of columns? df1.merge(df2).merge(df3)?

Tags: python python-3.x pandas


【Solution 1】:

Take a look at the merge function; some examples can be found here. For your specific problem, try the following:

combined_df = df1.merge(df2, on="ID1", how="inner").merge(df3, on="Address2", how="inner")
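As a minimal sketch of the chained merge above, the following uses a trimmed-down subset of the question's rows and columns (the trimming is an assumption made here for brevity, not part of the original answer). It also passes pandas' `validate` argument so that the merge raises an error if the keys do not have the expected one-to-many and many-to-one relationships.

```python
import pandas as pd

# Trimmed-down versions of the question's frames (assumption: only a few
# rows and columns are kept so the example stays short).
df1 = pd.DataFrame({
    "Name1": ["Matt", "Matt"],
    "Name2": ["John1", "John2"],
    "ID1": [1, 2],
})
df2 = pd.DataFrame({
    "ID1": [1, 1, 2],
    "Name4": ["face1", "face2", "head1"],
    "Address2": [r"face\123", r"face\345", r"head\123"],
})
df3 = pd.DataFrame({
    "Address2": [r"face\123", r"face\345", r"head\123"],
    "Hash": ["abc123", "cde321", "123efg"],
})

# Each ID1 appears once in df1 and possibly many times in df2;
# each Address2 appears once in df3.
combined_df = (
    df1.merge(df2, on="ID1", how="inner", validate="one_to_many")
       .merge(df3, on="Address2", how="inner", validate="many_to_one")
)
print(combined_df)
```

If a key turns out to be duplicated where it should be unique, `validate` makes the merge fail loudly instead of silently multiplying rows.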

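【Comments】:

    【Solution 2】:

    I think this will work. The merge function does most of the work for you once you name the columns you want to join on.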

    import numpy as np
    import pandas as pd
    
    data = np.array([['Name1','Name2','Name3','Address','ID1','ID2','Own'],
                     ['Matt','John1','Jill','878 home','1','0','Deal'],
                     ['Matt', 'John2', 'Jack', '879 home', '2', '1', 'Dael']])
    
    data2 = np.array([['Name1','ID1','Address','Name4','Address2'],
                     ['Matt', '1','878 home','face1',"face.123"],
                     ['Matt', '1','878 home', 'face2','face.345'],
                      ['Matt', '1','878 home', 'face3', 'face.678'],
                      ['Matt', '2', '879 home', 'head1', 'head.123'],
                      ['Matt', '2', '879 home', 'head2',  'head.345'],
                      ['Matt', '2', '879 home', 'head3', 'head.678']])
    #print(data)
    data3 = np.array([['Address2','Hash'],
                     ['face.123', 'abc123'],
                    ['face.345','cde321'],
                     ['face.678', 'efg123'],
                    ['head.123', '123efg'],
                    ['head.345', 'efg321'],
                    ['head.678', 'acd321']])
    
    df1 = pd.DataFrame(data=data[1:,:], columns=data[0,:])
    df2 = pd.DataFrame(data=data2[1:,:], columns=data2[0,:])
    df3 = pd.DataFrame(data=data3[1:,:], columns=data3[0,:])
    
    
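    # Join df1 to df2 on the shared ID1 key, then attach Hash via Address2.
    # Name1 and Address exist in both df1 and df2, so merging on ID1 alone
    # yields Name1_x/Name1_y and Address_x/Address_y columns.
    Cdf = pd.merge(df1, df2, on='ID1', how='inner')
    Ddf = pd.merge(Cdf, df3, on='Address2', how='inner')
    print(Ddf)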
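Because the frames above are built from NumPy string arrays, every column, including ID1 and ID2, ends up holding strings. One possible alternative, sketched here as an assumption rather than as the answerer's method, is to build the frames from dictionaries so that the ID columns keep an integer dtype. The merge calls are the same as above.

```python
import pandas as pd

# Same data as above, built from dicts so ID1/ID2 stay integers.
df1 = pd.DataFrame({
    "Name1": ["Matt", "Matt"],
    "Name2": ["John1", "John2"],
    "Name3": ["Jill", "Jack"],
    "Address": ["878 home", "879 home"],
    "ID1": [1, 2],
    "ID2": [0, 1],
    "Own": ["Deal", "Dael"],
})
df2 = pd.DataFrame({
    "Name1": ["Matt"] * 6,
    "ID1": [1, 1, 1, 2, 2, 2],
    "Address": ["878 home"] * 3 + ["879 home"] * 3,
    "Name4": ["face1", "face2", "face3", "head1", "head2", "head3"],
    "Address2": ["face.123", "face.345", "face.678",
                 "head.123", "head.345", "head.678"],
})
df3 = pd.DataFrame({
    "Address2": df2["Address2"].tolist(),
    "Hash": ["abc123", "cde321", "efg123", "123efg", "efg321", "acd321"],
})

Cdf = pd.merge(df1, df2, on="ID1", how="inner")
Ddf = pd.merge(Cdf, df3, on="Address2", how="inner")
print(Ddf.dtypes)  # ID1 and ID2 are reported as integer columns
```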
    

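    【Comments】:

      【Solution 3】:

      Based on your desired output, it looks like you don't need any specification beyond the default merge on the intersection of columns.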

      >>> df1.merge(df2).merge(df3)
      
        Name1  Name2 Name3  Address  ID1  ID2   Own  Name4  Address2    Hash
      0  Matt  John1  Jill  878 home    1    0  Deal  face1  face\123  abc123
      1  Matt  John1  Jill  878 home    1    0  Deal  face2  face\345  cde321
      2  Matt  John1  Jill  878 home    1    0  Deal  face3  face\678  efg123
      3  Matt  John2  Jack  879 home    2    1  Dael  head1  head\123  123efg
      4  Matt  John2  Jack  879 home    2    1  Dael  head2  head\345  efg321
      5  Matt  John2  Jack  879 home    2    1  Dael  head3  head\678  acd321
      

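      Specifying a single column to merge on, as the accepted answer does, actually causes problems, because you end up with suffixed columns.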

      >>> df1.merge(df2, on="ID1", how="inner").merge(df3, on="Address2", how="inner")
      
        Name1_x  Name2 Name3 Address_x  ID1  ID2   Own Name1_y Address_y  Name4  \
      0    Matt  John1  Jill   878home    1    0  Deal    Matt   878home  face1   
      1    Matt  John1  Jill   878home    1    0  Deal    Matt   878home  face2   
      2    Matt  John1  Jill   878home    1    0  Deal    Matt   878home  face3   
      3    Matt  John2  Jack   879home    2    1  Dael    Matt   879home  head1   
      4    Matt  John2  Jack   879home    2    1  Dael    Matt   879home  head2   
      5    Matt  John2  Jack   879home    2    1  Dael    Matt   879home  head3   
      
         Address2    Hash  
      0  face\123  abc123  
      1  face\345  cde321  
      2  face\678  efg123  
      3  head\123  123efg  
      4  head\345  efg321  
      5  head\678  acd321 
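If you prefer to state the keys explicitly rather than rely on the default, one option is to pass every column the two frames share. The sketch below rebuilds the question's data, computes the shared columns, and reorders the result to match the desired output. The variable names `shared`, `merged`, `wanted`, and `result` are illustrative choices, not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name1": ["Matt", "Matt"],
    "Name2": ["John1", "John2"],
    "Name3": ["Jill", "Jack"],
    "Address": ["878 home", "879 home"],
    "ID1": [1, 2],
    "ID2": [0, 1],
    "Own": ["Deal", "Dael"],
})
df2 = pd.DataFrame({
    "Name1": ["Matt"] * 6,
    "ID1": [1, 1, 1, 2, 2, 2],
    "Address": ["878 home"] * 3 + ["879 home"] * 3,
    "Name4": ["face1", "face2", "face3", "head1", "head2", "head3"],
    "Address2": [r"face\123", r"face\345", r"face\678",
                 r"head\123", r"head\345", r"head\678"],
})
df3 = pd.DataFrame({
    "Address2": df2["Address2"].tolist(),
    "Hash": ["abc123", "cde321", "efg123", "123efg", "efg321", "acd321"],
})

# Columns present in both df1 and df2, in df1's order.
shared = [c for c in df1.columns if c in df2.columns]

# Merging on all shared columns leaves nothing to disambiguate,
# so no _x/_y suffixes are produced.
merged = df1.merge(df2, on=shared, how="inner").merge(df3, on="Address2", how="inner")

# Reorder the columns to match the layout requested in the question.
wanted = ["Name1", "Name2", "ID1", "Address", "Own",
          "Name3", "ID2", "Name4", "Address2", "Hash"]
result = merged[wanted]
print(result)
```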
      

      【Comments】:
