【Question title】: Copy a row value to another column based on condition using Python pandas
【Posted】: 2019-10-31 17:57:26
【Question】:

I have a dataframe that can be generated with the code below:

import numpy as np
import pandas as pd

data_file = pd.DataFrame({'person_id':[1,1,1,2,2,2,3,3,3],'ob.date': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan],
                     'observation': ['Age','interviewdate','marital_status','Age','interviewdate','marital_status','Age','interviewdate','marital_status'],
                     'answer': [21,'21/08/2017','Single',26,'11/03/2010','Single',41,'31/09/2012','Married'],
                     'visit.date': [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan]
                     })

In the input dataframe, every row has NaN in both the 'ob.date' and 'visit.date' columns.

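What I want to do is take each person's date value (the interviewdate row) from the 'answer' column and put it in the 'ob.date' and 'visit.date' columns for every row of that same person.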

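I tried filtering the dataframe, but I am not sure how to continue. My attempt only works on the filtered rows, whereas I want the dates filled into the original (input) dataframe: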

df2 = data_file[(data_file.observation == 'interviewdate')]
df2.reset_index(inplace=True)
df3=data_file.merge(df2)
df3['ob.date']=df2['answer']
df3['visit.date']=df2['answer']

How can I get an output in which each person's interview date is filled into the 'ob.date' and 'visit.date' columns on all of that person's rows?

【Question comments】:

    Tags: python python-3.x pandas list pandas-groupby


    【Solution 1】:

    Filter the interviewdate rows, build a Series of the answers indexed by person_id, then create the new columns with Series.map:

    s = data_file[(data_file.observation == 'interviewdate')].set_index('person_id')['answer']
    print (s)
    person_id
    1    21/08/2017
    2    11/03/2010
    3    31/09/2012
    Name: answer, dtype: object
    
    data_file['ob.date'] = data_file['person_id'].map(s)
    data_file['visit.date'] = data_file['person_id'].map(s)
    print (data_file)
       person_id     ob.date     observation      answer  visit.date
    0          1  21/08/2017             Age          21  21/08/2017
    1          1  21/08/2017   interviewdate  21/08/2017  21/08/2017
    2          1  21/08/2017  marital_status      Single  21/08/2017
    3          2  11/03/2010             Age          26  11/03/2010
    4          2  11/03/2010   interviewdate  11/03/2010  11/03/2010
    5          2  11/03/2010  marital_status      Single  11/03/2010
    6          3  31/09/2012             Age          41  31/09/2012
    7          3  31/09/2012   interviewdate  31/09/2012  31/09/2012
    8          3  31/09/2012  marital_status     Married  31/09/2012
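
The `Series.map` lookup above assumes each person has exactly one interviewdate row, so that the index of `s` is unique. As a hedged alternative (not part of the original answer), the same fill can be sketched with `Series.where` and `groupby(...).transform('first')`, which simply takes the first interviewdate per person if there happen to be several. The two-person frame below is a shortened stand-in for the question's data, and the names `dates` and `per_person` are illustrative only.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's data_file (two people instead of three).
data_file = pd.DataFrame({
    'person_id': [1, 1, 1, 2, 2, 2],
    'ob.date': [np.nan] * 6,
    'observation': ['Age', 'interviewdate', 'marital_status'] * 2,
    'answer': [21, '21/08/2017', 'Single', 26, '11/03/2010', 'Single'],
    'visit.date': [np.nan] * 6,
})

# Keep the answer only on interviewdate rows; every other row becomes NaN.
dates = data_file['answer'].where(data_file['observation'].eq('interviewdate'))

# Broadcast the first non-missing value of each person to all of that person's rows.
per_person = dates.groupby(data_file['person_id']).transform('first')

data_file['ob.date'] = per_person
data_file['visit.date'] = per_person
print(data_file)
```

This keeps the original row order and index, so no merge back into the input frame is needed.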
    

    If changing the data format is an option, use DataFrame.pivot:

    df = data_file.pivot(index='person_id', columns='observation', values='answer')
    print (df)
    observation Age interviewdate marital_status
    person_id                                   
    1            21    21/08/2017         Single
    2            26    11/03/2010         Single
    3            41    31/09/2012        Married
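
Once the data is in wide form, the interviewdate column can be turned into real datetimes. The following is a minimal sketch under stated assumptions, not part of the original answer: it assumes the dates are written day/month/year, and it uses `errors='coerce'` so that values which are not valid calendar dates (the sample's 31/09/2012 is one, since September has 30 days) become `NaT` instead of raising. The frame is a shortened stand-in for the question's data and the name `wide` is illustrative.

```python
import pandas as pd

# Long-format stand-in for the question's data, without the empty date columns.
data_file = pd.DataFrame({
    'person_id': [1, 1, 1, 3, 3, 3],
    'observation': ['Age', 'interviewdate', 'marital_status'] * 2,
    'answer': [21, '21/08/2017', 'Single', 41, '31/09/2012', 'Married'],
})

# One row per person, one column per observation type.
wide = data_file.pivot(index='person_id', columns='observation', values='answer')

# Parse day/month/year strings; entries that are not real dates become NaT.
wide['interviewdate'] = pd.to_datetime(
    wide['interviewdate'], format='%d/%m/%Y', errors='coerce'
)
print(wide)
```

Checking the parsed column for `NaT` is a quick way to spot data-entry errors before the dates are used elsewhere.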
    

    【Comments】:
