【发布时间】:2021-06-07 11:16:30
【问题描述】:
我有以下数据框 -
df = pd.DataFrame({
'ID': [1, 2, 2, 3, 3, 3, 4],
'Prior': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
'Current': ['a1', 'c', 'c1', 'e', 'f', 'f1', 'g1'],
'Date': ['1/1/2019', '5/1/2019', '10/2/2019', '15/3/2019', '6/5/2019',
'7/9/2019', '16/11/2019']
})
这是我想要的输出 -
desired_df = pd.DataFrame({
'ID': [1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4],
'Prior_Current': ['a', 'a1', 'b', 'c', 'c1', 'd', 'e', 'f', 'f1', 'g',
'g1'],
'Start_Date': ['', '1/1/2019', '', '5/1/2019', '10/2/2019', '', '15/3/2019',
'6/5/2019', '7/9/2019', '', '16/11/2019'],
'End_Date': ['1/1/2019', '', '5/1/2019', '10/2/2019', '', '15/3/2019',
'6/5/2019', '7/9/2019', '', '16/11/2019', '']
})
我尝试了以下 -
keys = ['Prior', 'Current']
df2 = (
pd.melt(df, id_vars='ID', value_vars=keys, value_name='Prior_Current')
.merge(df[['ID', 'Date']], how='left', on='ID')
)
df2['Start_Date'] = np.where(df2['variable'] == 'Prior', df2['Date'], '')
df2['End_Date'] = np.where(df2['variable'] == 'Current', df2['Date'], '')
df2.sort_values(['ID'], ascending=True, inplace=True)
但这似乎不起作用。请帮忙。
【问题讨论】:
-
代码格式化使用
`而不是' -
也请帮忙解决一下
标签: python pandas dataframe multiple-columns melt