Let's first generate some random data that looks like what you describe:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({
... 'datetime': pd.date_range(pd.Timestamp.today(), periods=2048, freq='300ms'),
... 'value': np.random.randint(0, 100, 2048) / 200 + 1
... })
If your datetime values are strings rather than actual datetimes, convert them first:
>>> df['datetime'] = pd.to_datetime(df['datetime'])
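As a minimal sketch of that conversion, assuming the timestamps arrive as plain strings: the frame `raw` and its three rows are made up for illustration and are not taken from the question.

```python
import pandas as pd

# Hypothetical sample: timestamps stored as plain strings, not datetimes.
raw = pd.DataFrame({
    'datetime': [
        '2021-09-27 11:07:15.100',
        '2021-09-27 11:07:15.400',
        '2021-09-27 11:07:16.000',
    ],
    'value': [1.10, 1.30, 1.25],
})

# Parse the strings into real timestamps so time-based grouping works.
raw['datetime'] = pd.to_datetime(raw['datetime'])

# The column should now report a datetime64 dtype.
print(raw['datetime'].dtype)
```

pd.Grouper with a `freq` needs a datetime-like column, so this step has to come before any time-based grouping.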
Then you can use pd.Grouper to get what you want. For example, to average per second:
>>> df.groupby(pd.Grouper(key='datetime', freq='1s'))['value'].mean()
datetime
2021-09-27 11:07:15 1.190000
2021-09-27 11:07:16 1.180000
2021-09-27 11:07:17 1.141250
2021-09-27 11:07:18 1.285000
2021-09-27 11:07:19 1.190000
...
2021-09-27 11:17:25 1.255000
2021-09-27 11:17:26 1.305000
2021-09-27 11:17:27 1.150000
2021-09-27 11:17:28 1.258333
2021-09-27 11:17:29 1.312500
Freq: S, Name: value, Length: 615, dtype: float64
Every 5 seconds:
>>> df.groupby(pd.Grouper(key='datetime', freq='5s'))['value'].mean()
datetime
2021-09-27 11:07:15 1.194286
2021-09-27 11:07:20 1.267647
2021-09-27 11:07:25 1.305000
2021-09-27 11:07:30 1.223125
2021-09-27 11:07:35 1.255294
...
2021-09-27 11:17:05 1.280882
2021-09-27 11:17:10 1.225294
2021-09-27 11:17:15 1.329687
2021-09-27 11:17:20 1.278235
2021-09-27 11:17:25 1.262353
Freq: 5S, Name: value, Length: 123, dtype: float64
And so on; see the reference of frequency expressions (the "offset aliases" section of the pandas time series documentation).
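To make the effect of the frequency string easier to check by hand, here is a small sketch on fixed data instead of random values. The frame `small`, its start time, and the three frequency strings are choices made for illustration only.

```python
import pandas as pd

# Ten samples spaced 300 ms apart, starting at a fixed (arbitrary) time
# so the grouping is reproducible.
small = pd.DataFrame({
    'datetime': pd.date_range('2021-09-27 11:07:15', periods=10, freq='300ms'),
    'value': [1.0, 1.2, 1.4, 1.6, 1.1, 1.3, 1.5, 1.0, 1.2, 1.4],
})

# Group the same data with a few different frequency strings.
results = {}
for freq in ['500ms', '1s', '1min']:
    grouper = pd.Grouper(key='datetime', freq=freq)
    results[freq] = small.groupby(grouper)['value'].mean()
    print(freq, 'gives', len(results[freq]), 'bins')
```

A coarser frequency yields fewer bins, each averaging more samples; with '1min' all ten samples here fall into a single bin.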
Also note that your averages may not always be computed from the same number of values in each time unit:
>>> df.groupby(pd.Grouper(key='datetime', freq='1s'))['value'].count()
datetime
2021-09-27 11:07:15 1
2021-09-27 11:07:16 3
2021-09-27 11:07:17 4
2021-09-27 11:07:18 3
2021-09-27 11:07:19 3
..
2021-09-27 11:17:25 3
2021-09-27 11:17:26 4
2021-09-27 11:17:27 3
2021-09-27 11:17:28 3
2021-09-27 11:17:29 4
Freq: S, Name: value, Length: 615, dtype: int64
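If it helps to see the mean next to the number of samples behind it, both can be computed in one pass with `agg`. This is only a sketch on made-up fixed data; the frame `small` and the `min_count` threshold are hypothetical and not part of the original question.

```python
import pandas as pd

# Ten samples spaced 300 ms apart, starting at a fixed (arbitrary) time.
small = pd.DataFrame({
    'datetime': pd.date_range('2021-09-27 11:07:15', periods=10, freq='300ms'),
    'value': [1.0, 1.2, 1.4, 1.6, 1.1, 1.3, 1.5, 1.0, 1.2, 1.4],
})

# One row per second, with the mean and the sample count side by side.
stats = (
    small.groupby(pd.Grouper(key='datetime', freq='1s'))['value']
    .agg(['mean', 'count'])
)
print(stats)

# Hypothetical threshold: keep only seconds backed by enough samples.
min_count = 4
reliable = stats[stats['count'] >= min_count]
print(reliable)
```

Whether to filter on the count at all, and which threshold to use, depends on your data; the value 4 above is just a placeholder.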