You can use apply with a custom function:
df['new'] = df.apply(lambda x : np.where(pd.date_range(x['Start'], x['End']).weekday < 5, 16, 24).sum(), axis=1)
print (df)
       Start        End  new
0 2017-02-03 2017-03-15  752
1 2017-02-05 2017-03-16  728
2 2017-02-06 2017-03-17  720
3 2017-02-10 2017-03-18  680
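To make the result easier to check, here is a minimal self-contained sketch of the same approach. The sample frame is reconstructed from the printed output above (the original data is not shown in this answer, so treat it as an assumption), and the helper name hours_in_range is hypothetical. It also breaks down the first row by hand: 16 hours for each Monday to Friday, 24 hours for each Saturday and Sunday, with both endpoints included.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the printed output above (assumption:
# the original input data is not shown in this answer).
df = pd.DataFrame({
    "Start": pd.to_datetime(["2017-02-03", "2017-02-05", "2017-02-06", "2017-02-10"]),
    "End": pd.to_datetime(["2017-03-15", "2017-03-16", "2017-03-17", "2017-03-18"]),
})

def hours_in_range(row):
    # Every calendar day from Start to End, both endpoints included.
    days = pd.date_range(row["Start"], row["End"])
    # weekday is Monday=0 ... Sunday=6, so "< 5" selects Monday to Friday.
    return int(np.where(days.weekday < 5, 16, 24).sum())

df["new"] = df.apply(hours_in_range, axis=1)
print(df)

# Manual breakdown for the first row.
days = pd.date_range("2017-02-03", "2017-03-15")
n_weekdays = int((days.weekday < 5).sum())
n_weekend = len(days) - n_weekdays
print(n_weekdays, n_weekend, n_weekdays * 16 + n_weekend * 24)  # prints: 29 12 752
```

The first row spans 41 days, 29 weekdays and 12 weekend days, so 29 * 16 + 12 * 24 = 752, which matches the new column.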
The same logic, written as a named function:
def f(x):
    b = pd.date_range(x['Start'], x['End']).weekday
    return np.where(b < 5, 16, 24).sum()

df['new'] = df.apply(f, axis=1)
print (df)
       Start        End  new
0 2017-02-03 2017-03-15  752
1 2017-02-05 2017-03-16  728
2 2017-02-06 2017-03-17  720
3 2017-02-10 2017-03-18  680
Another solution, though I think it is more complicated:
#reshape df
df1 = df.stack().reset_index()
df1.columns = ['i','c','date']
#groupby by index and resample to days, forward fill NaNs
df1 = (df1.set_index('date').groupby('i').resample('D').ffill()
          .reset_index(level=0, drop=True).reset_index())
#get hours
df1['tot'] = np.where(df1['date'].dt.weekday < 5, 16, 24)
#sum by index
s = df1.groupby('i')['tot'].sum()
#join to original
df = df.join(s)
print (df.head(10))
       Start        End  tot
0 2017-02-03 2017-03-15  752
1 2017-02-05 2017-03-16  728
2 2017-02-06 2017-03-17  720
3 2017-02-10 2017-03-18  680
Timings:
df = pd.concat([df]*100).reset_index(drop=True)
print (df)
def f(df):
    df1 = df.stack().reset_index()
    df1.columns = ['i','c','date']
    df1 = df1.set_index('date').groupby('i').resample('D').ffill().reset_index(level=0, drop=True).reset_index()
    df1['tot'] = np.where(df1['date'].dt.weekday < 5, 16, 24)
    s = df1.groupby('i')['tot'].sum()
    return df.join(s)

print (f(df))
mapping = {i:16 if i<5 else 24 for i in range(7)}
In [190]: %timeit (f(df))
1 loop, best of 3: 482 ms per loop
#MaxU solution
In [191]: %timeit df['oncall_hours'] = df.apply(lambda x: pd.date_range(x['Start'], x['End']).to_series().dt.weekday.map(mapping).sum(), axis=1)
1 loop, best of 3: 531 ms per loop
In [192]: %timeit df['new'] = df.apply(lambda x : np.where(pd.date_range(x['Start'], x['End']).weekday < 5, 16, 24).sum(), axis=1)
10 loops, best of 3: 166 ms per loop
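%timeit is an IPython magic, so the timings above cannot be run as a plain script. Below is a rough sketch of how the two apply variants could be compared with the standard timeit module instead. The sample frame is again reconstructed from the output earlier in this answer, and the function names with_map and with_where are hypothetical. Absolute numbers will differ by machine and pandas version, so no timings are claimed here.

```python
import timeit

import numpy as np
import pandas as pd

# Sample frame reconstructed from the output shown earlier (assumption).
base = pd.DataFrame({
    "Start": pd.to_datetime(["2017-02-03", "2017-02-05", "2017-02-06", "2017-02-10"]),
    "End": pd.to_datetime(["2017-03-15", "2017-03-16", "2017-03-17", "2017-03-18"]),
})
# Repeat the 4 rows 100 times, as in the timing setup above.
df = pd.concat([base] * 100).reset_index(drop=True)

# Weekday number (Monday=0) to hours: 16 on Monday to Friday, 24 on weekends.
mapping = {i: 16 if i < 5 else 24 for i in range(7)}

def with_map(frame):
    # Variant using Series.map on the weekday numbers.
    return frame.apply(
        lambda x: pd.date_range(x["Start"], x["End"]).to_series().dt.weekday.map(mapping).sum(),
        axis=1,
    )

def with_where(frame):
    # Variant using np.where directly on the DatetimeIndex weekdays.
    return frame.apply(
        lambda x: np.where(pd.date_range(x["Start"], x["End"]).weekday < 5, 16, 24).sum(),
        axis=1,
    )

for fn in (with_map, with_where):
    # Best of 3 single runs, reported in milliseconds.
    best = min(timeit.repeat(lambda: fn(df), number=1, repeat=3))
    print(f"{fn.__name__}: {best * 1000:.0f} ms")
```

Both functions return the same values, so any difference in the printed times comes from the per-row overhead of building a Series and mapping it versus working on the index directly.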