[Question title]: Merging pandas DataFrames with NaN for missing rows
[Posted]: 2019-02-05 23:14:07
[Question]:

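I want to use my reference calendar as a scaffold to fill in the rows that are missing from my main data. To do that, I want to join the two DataFrames.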

import pandas as pd
import numpy as np

d1 = { 'Year': [2019,2019,2019,2019,2019,2019],
        'Week': [1,2,3,5,5,6],
        'Part': ['A','A','A','A','B','B'],
        'Static': [20,20,20,20,40,40],
        'Value': [np.nan,10,np.nan,50,30,np.nan] }

d2 = { 'Year':[2019,2019,2019,2019,2019,2019,2019,2019,2019,2019],
        'Week':[1,2,3,4,5,6,7,8,9,10] }

df1 = pd.DataFrame(d1)
df2 = pd.DataFrame(d2)

The expected output is as follows:

    Year  Week Part  Static  Value
0   2019     1    A      20    NaN
1   2019     2    A      20   10.0
2   2019     3    A      20    NaN
3   2019     4    A      20    NaN
4   2019     5    A      20   50.0
5   2019     6    A      20    NaN
6   2019     7    A      20    NaN
7   2019     8    A      20    NaN
8   2019     9    A      20    NaN
9   2019    10    A      20    NaN
10  2019     1    B      40    NaN
11  2019     2    B      40    NaN
12  2019     3    B      40    NaN
13  2019     4    B      40    NaN
14  2019     5    B      40   30.0
15  2019     6    B      40    NaN
16  2019     7    B      40    NaN
17  2019     8    B      40    NaN
18  2019     9    B      40    NaN
19  2019    10    B      40    NaN

[Comments]:

    Tags: python pandas dataframe join merge


    [Solution 1]:

    Comments are inline.

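    # First, replicate `df2` for each unique Part (a cross join on a constant key).
    # On pandas 1.2+, merging with how='cross' and no helper key is an equivalent shortcut.
    df3 = (df2.assign(Key=1)
              .merge(pd.DataFrame({'Part': df1.Part.unique(), 'Key': 1}), on='Key')
              .drop(columns='Key'))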
    df3
    
        Year  Week Part
    0   2019     1    A
    1   2019     1    B
    2   2019     2    A
    3   2019     2    B
    4   2019     3    A
    5   2019     3    B
    6   2019     4    A
    7   2019     4    B
    8   2019     5    A
    9   2019     5    B
    10  2019     6    A
    11  2019     6    B
    12  2019     7    A
    13  2019     7    B
    14  2019     8    A
    15  2019     8    B
    16  2019     9    A
    17  2019     9    B
    18  2019    10    A
    19  2019    10    B
    
    # Next, perform left outer merge with `df1`.     
    df3.merge(df1, on=['Year', 'Week', 'Part'], how='left')
    
        Year  Week Part  Static  Value
    0   2019     1    A    20.0    NaN
    1   2019     1    B     NaN    NaN
    2   2019     2    A    20.0   10.0
    3   2019     2    B     NaN    NaN
    4   2019     3    A    20.0    NaN
    5   2019     3    B     NaN    NaN
    6   2019     4    A     NaN    NaN
    7   2019     4    B     NaN    NaN
    8   2019     5    A    20.0   50.0
    9   2019     5    B    40.0   30.0
    10  2019     6    A     NaN    NaN
    11  2019     6    B    40.0    NaN
    12  2019     7    A     NaN    NaN
    13  2019     7    B     NaN    NaN
    14  2019     8    A     NaN    NaN
    15  2019     8    B     NaN    NaN
    16  2019     9    A     NaN    NaN
    17  2019     9    B     NaN    NaN
    18  2019    10    A     NaN    NaN
    19  2019    10    B     NaN    NaN
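
The merged result above still differs from the expected output in the question in two ways: `Static` is NaN on the rows that came only from the calendar, and the rows are ordered by week rather than by part. The following is a minimal sketch of one way to close that gap. It assumes `Static` is constant within each `Part` (true for the sample data, but not stated in the question), and the names `parts`, `static_by_part`, and `out` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df1 = pd.DataFrame({
    'Year': [2019] * 6,
    'Week': [1, 2, 3, 5, 5, 6],
    'Part': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Static': [20, 20, 20, 20, 40, 40],
    'Value': [np.nan, 10, np.nan, 50, 30, np.nan],
})
df2 = pd.DataFrame({'Year': [2019] * 10, 'Week': list(range(1, 11))})

# Calendar x Part scaffold, built with the constant-key merge from the answer.
parts = pd.DataFrame({'Part': df1['Part'].unique(), 'Key': 1})
df3 = df2.assign(Key=1).merge(parts, on='Key').drop(columns='Key')

# Left merge keeps every scaffold row; unmatched rows get NaN.
out = df3.merge(df1, on=['Year', 'Week', 'Part'], how='left')

# Assumption: Static has a single value per Part, so look it up by Part
# instead of relying on the (partly missing) merged column.
static_by_part = df1.groupby('Part')['Static'].first()
out['Static'] = out['Part'].map(static_by_part)

# Order rows by part, then by calendar position, as in the expected output.
out = out.sort_values(['Part', 'Year', 'Week']).reset_index(drop=True)
print(out)
```

If `Static` can vary within a part, the per-part lookup is not appropriate, and a grouped forward/backward fill or an explicit rule would be needed instead.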
    
