【发布时间】:2020-10-11 12:35:27
【问题描述】:
我的 df:
import pandas as pd
import numpy as np
df = pd.DataFrame({'id':[1,1,1,2,2],
'time':['2020-01-01 12:00:15','2020-01-01 12:00:30','2020-01-01 12:00:45','2020-01-03 08:00:00','2020-01-03 08:00:15'],
'time1':['2020-01-01 12:00:00','2020-01-01 12:00:00','2020-01-01 12:00:00','2020-01-01 12:00:00','2020-01-01 12:00:00'],
'numb':[1,5,8,0,4]})
df['time'] = pd.to_datetime(df['time'])
df['time1'] = pd.to_datetime(df['time1'])
df['numb_diff'] = df['numb'] - df['numb'].shift()
输出:
id time time1 numb numb_diff
0 1 2020-01-01 12:00:15 2020-01-01 12:00:00 1 NaN
1 1 2020-01-01 12:00:30 2020-01-01 12:00:00 5 4.0
2 1 2020-01-01 12:00:45 2020-01-01 12:00:00 8 3.0
3 2 2020-01-03 08:00:00 2020-01-01 12:00:00 0 -8.0
4 2 2020-01-03 08:00:15 2020-01-01 12:00:00 4 4.0
现在我想将time1 设置为组中time 的最小值(id),只要id 在位置numb_diff 的第一个条目是
预期输出:
id time time1 numb numb_diff
0 1 2020-01-01 12:00:15 2020-01-01 12:00:00 1 NaN
1 1 2020-01-01 12:00:30 2020-01-01 12:00:00 5 4.0
2 1 2020-01-01 12:00:45 2020-01-01 12:00:00 8 3.0
3 2 2020-01-03 08:00:00 2020-01-03 08:00:00 0 -8.0 #Changing time1 to the min of time the group(id = 2)
4 2 2020-01-03 08:00:15 2020-01-03 08:00:00 4 4.0
【问题讨论】:
标签: python python-3.x pandas dataframe pandas-groupby