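【Title】: Concatenating pandas DataFrames keeping only rows with matching values in a column?
【Posted】: 2016-10-17 01:19:38
【Question】:

I'm trying to "merge-concatenate" two pandas DataFrames. Basically, I want to stack the two DataFrames, but keep only the rows of each DataFrame whose value in a given column also appears in that column of the other DataFrame. For example: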

data1:

+---+------------+-----------+-------+
|   | first_name | last_name | class |
+---+------------+-----------+-------+
| 0 | Alex       | Anderson  |     1 |
| 1 | Amy        | Ackerman  |     2 |
| 2 | Allen      | Ali       |     3 |
| 3 | Alice      | Aoni      |     4 |
| 4 | Andrew     | Andrews   |     4 |
| 5 | Ayoung     | Atiches   |     5 |
+---+------------+-----------+-------+

data2:

+---+------------+-----------+-------+
|   | first_name | last_name | class |
+---+------------+-----------+-------+
| 0 | Billy      | Bonder    |     4 |
| 1 | Brian      | Black     |     5 |
| 2 | Bran       | Balwner   |     6 |
| 3 | Bryce      | Brice     |     7 |
| 4 | Betty      | Btisan    |     8 |
| 5 | Bruce      | Bronson   |     8 |
+---+------------+-----------+-------+

The DataFrame produced by performing this operation on data1 and data2 should then look like this:

result:

+---+------------+-----------+-------+
|   | first_name | last_name | class |
+---+------------+-----------+-------+
| 3 | Alice      | Aoni      |     4 |
| 4 | Andrew     | Andrews   |     4 |
| 5 | Ayoung     | Atiches   |     5 |
| 0 | Billy      | Bonder    |     4 |
| 1 | Brian      | Black     |     5 |
+---+------------+-----------+-------+

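Basically, I'm trying to merge the two datasets and then stack the columns. I can think of a few ways to do this, but they are all hack-y. I could merge data1 and data2 and then stack the columns, or use a map like the following: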

map1 = data1['class'].map(lambda x: x in list(data2['class']))
map2 = data2['class'].map(lambda x: x in list(data1['class']))
pd.concat([data1[map1], data2[map2]])

But is there a more elegant solution?

【Question comments】:

    Tags: python pandas dataframe


    【Solution 1】:

    How about this?

    In [335]: cls = np.intersect1d(data1['class'], data2['class'])
    
    In [336]: cls
    Out[336]: array([4, 5], dtype=int64)
    
    In [337]: pd.concat([data1.loc[data1['class'].isin(cls)], data2.loc[data2['class'].isin(cls)]])
    Out[337]:
      first_name last_name  class
    3      Alice      Aoni      4
    4     Andrew   Andrews      4
    5     Ayoung   Atiches      5
    0      Billy    Bonder      4
    1      Brian     Black      5
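
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The two frames are rebuilt from the tables in the question, rows are selected with plain boolean indexing, and the variable names `common` and `result` are my own choices rather than anything from the original session.

```python
import numpy as np
import pandas as pd

# Example frames rebuilt from the tables in the question.
data1 = pd.DataFrame({
    "first_name": ["Alex", "Amy", "Allen", "Alice", "Andrew", "Ayoung"],
    "last_name": ["Anderson", "Ackerman", "Ali", "Aoni", "Andrews", "Atiches"],
    "class": [1, 2, 3, 4, 4, 5],
})
data2 = pd.DataFrame({
    "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty", "Bruce"],
    "last_name": ["Bonder", "Black", "Balwner", "Brice", "Btisan", "Bronson"],
    "class": [4, 5, 6, 7, 8, 8],
})

# Sorted, unique 'class' values that occur in both frames.
common = np.intersect1d(data1["class"], data2["class"])

# Keep only the rows whose 'class' is in the common set, then stack them.
result = pd.concat([
    data1[data1["class"].isin(common)],
    data2[data2["class"].isin(common)],
])
print(result)
```

Each frame keeps its original index labels, so the stacked result carries the labels 3, 4, 5, 0, 1 as in the expected output from the question.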
    

    Or, on pandas versions before 2.0 (where DataFrame.append still exists):

    In [338]: data1.loc[data1['class'].isin(cls)].append(data2.loc[data2['class'].isin(cls)])
    Out[338]:
      first_name last_name  class
    3      Alice      Aoni      4
    4     Andrew   Andrews      4
    5     Ayoung   Atiches      5
    0      Billy    Bonder      4
    1      Brian     Black      5
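
If this comes up often, the filtering step can probably be wrapped in a small helper that skips NumPy and tests each frame's column directly against the other's with `Series.isin`. The sketch below is only one way to do it: the function name `concat_matching` and its parameters are hypothetical, and the shortened sample data is made up to mirror the `class` values from the question.

```python
import pandas as pd

def concat_matching(left, right, on, ignore_index=False):
    """Stack the rows of `left` and `right` whose `on` value occurs in both frames.

    Hypothetical helper for illustration; not part of pandas.
    """
    left_keep = left[left[on].isin(right[on])]    # rows of left matched in right
    right_keep = right[right[on].isin(left[on])]  # rows of right matched in left
    return pd.concat([left_keep, right_keep], ignore_index=ignore_index)

# Shortened sample data using the same 'class' values as the question.
data1 = pd.DataFrame({
    "first_name": ["Alex", "Amy", "Allen", "Alice", "Andrew", "Ayoung"],
    "class": [1, 2, 3, 4, 4, 5],
})
data2 = pd.DataFrame({
    "first_name": ["Billy", "Brian", "Bran", "Bryce", "Betty", "Bruce"],
    "class": [4, 5, 6, 7, 8, 8],
})

stacked = concat_matching(data1, data2, on="class", ignore_index=True)
print(stacked)
```

Passing `ignore_index=True` renumbers the stacked rows from 0, which avoids the duplicate index labels that appear when the original labels are kept.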
    

    【Comments】:
