【Posted】: 2020-01-12 02:57:37
【Question】:
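I have a cyclic (sawtooth) dataset (picture and data below), and I am trying to get the derivative dy/dt. I want the derivative for each cycle (each rise), not just at every point as I have done here. Part of the problem is that the samples are unevenly spaced in time. Ideally, I want the slope of each cycle from its minimum to its maximum. This is an idealized case: my full dataset will be noisy, and the slope of each cycle will not necessarily match the one in this modeled set.
The data looks like this: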
time,y
12/15/18 01:10 AM,130352.146180556
12/16/18 01:45 AM,130355.219097222
12/17/18 01:47 AM,130358.223263889
12/18/18 02:15 AM,130361.281701389
12/19/18 03:15 AM,130364.406597222
12/20/18 03:25 AM,130352.427430556
12/21/18 03:27 AM,130355.431597222
12/22/18 05:18 AM,130358.663541667
12/23/18 06:44 AM,130361.842430556
12/24/18 07:19 AM,130364.915243056
12/25/18 07:33 AM,130352.944409722
12/26/18 07:50 AM,130355.979826389
12/27/18 09:13 AM,130359.153472222
12/28/18 11:53 AM,130362.4871875
12/29/18 01:23 PM,130365.673263889
12/30/18 02:17 PM,130353.785763889
12/31/18 02:23 PM,130356.798263889
01/01/19 04:41 PM,130360.085763889
01/02/19 05:01 PM,130363.128125
Here is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plot
from datetime import date, timedelta
import datetime
df=pd.read_csv('saw_data.csv')
df['time']=pd.to_datetime(df['time'])
Set the datetime as the index:
df.set_index(df['time'], inplace=True)
Here I try to find the time interval between consecutive data points:
df['Time_diff'] = pd.to_timedelta(df['time']-df['time'].shift()).dt.total_seconds().div(60)
#I believe units are minutes. For 'y' row 0 to 1, the diff is ~3 in about a day (86400 sec)
# so 3/86400 x 60 sec/min yields similar result to slope of 0.002 #/min.
'Time_diff' comes from the SO post: calculate the time difference between two consecutive rows in pandas
This should be the derivative:
df['slope']=np.gradient(df['y'],1)/df['Time_diff']
Here is the result:
print(df.head())
                                   time              y  Time_diff     slope
time
2018-12-15 01:10:00 2018-12-15 01:10:00  130352.146181        NaN       NaN
2018-12-16 01:45:00 2018-12-16 01:45:00  130355.219097     1475.0  0.002060
2018-12-17 01:47:00 2018-12-17 01:47:00  130358.223264     1442.0  0.002102
2018-12-18 02:15:00 2018-12-18 02:15:00  130361.281701     1468.0  0.002106
2018-12-19 03:15:00 2018-12-19 03:15:00  130364.406597     1500.0 -0.002951
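A note on the numbers above: with a scalar spacing, np.gradient takes central differences across neighbouring rows, so the value at 2018-12-19 (the peak of the first cycle) is computed from the drop into the next cycle, which is why it comes out negative. Dividing that central difference by the backward-looking 'Time_diff' also mixes two different intervals. The following is only a minimal sketch, assuming the first six rows of the sample data and a hypothetical variable name t_min for elapsed minutes, of passing the uneven sample times to np.gradient directly. It handles the uneven spacing, but the point at the peak is still contaminated by the drop.

```python
import numpy as np
import pandas as pd

# First six rows of the sample data (five rising points plus the first
# point of the next cycle).
times = pd.to_datetime(
    [
        "12/15/18 01:10 AM",
        "12/16/18 01:45 AM",
        "12/17/18 01:47 AM",
        "12/18/18 02:15 AM",
        "12/19/18 03:15 AM",
        "12/20/18 03:25 AM",
    ],
    format="%m/%d/%y %I:%M %p",
)
y = np.array(
    [
        130352.146180556,
        130355.219097222,
        130358.223263889,
        130361.281701389,
        130364.406597222,
        130352.427430556,
    ]
)

# Elapsed time in minutes since the first sample (t_min is an illustrative name).
t_min = ((times - times[0]).total_seconds() / 60.0).to_numpy()

# Passing the coordinate array lets np.gradient account for uneven spacing.
slope = np.gradient(y, t_min)

print(slope)  # per-point dy/dt in units of y per minute
```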
Here is a picture of the dataset:
【Comments】:
-
Are you saying you want the gradient of each line in the plot?
-
Basically, yes.
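One way to get a single gradient per line, offered as a sketch rather than a definitive answer: label the cycles by looking for drops in y, then fit a straight line to each cycle against elapsed minutes. The assumptions here are mine and not from the question: a new cycle is taken to start wherever y falls below the previous sample (noisy data would likely need a threshold or smoothing instead), and the names cycle, cycle_slope and slopes are illustrative. A least-squares fit via np.polyfit is used because it tolerates uneven spacing and averages over noise better than a two-point min-to-max difference.

```python
import io

import numpy as np
import pandas as pd

csv_text = """time,y
12/15/18 01:10 AM,130352.146180556
12/16/18 01:45 AM,130355.219097222
12/17/18 01:47 AM,130358.223263889
12/18/18 02:15 AM,130361.281701389
12/19/18 03:15 AM,130364.406597222
12/20/18 03:25 AM,130352.427430556
12/21/18 03:27 AM,130355.431597222
12/22/18 05:18 AM,130358.663541667
12/23/18 06:44 AM,130361.842430556
12/24/18 07:19 AM,130364.915243056
12/25/18 07:33 AM,130352.944409722
12/26/18 07:50 AM,130355.979826389
12/27/18 09:13 AM,130359.153472222
12/28/18 11:53 AM,130362.4871875
12/29/18 01:23 PM,130365.673263889
12/30/18 02:17 PM,130353.785763889
12/31/18 02:23 PM,130356.798263889
01/01/19 04:41 PM,130360.085763889
01/02/19 05:01 PM,130363.128125
"""

df = pd.read_csv(io.StringIO(csv_text))
df["time"] = pd.to_datetime(df["time"], format="%m/%d/%y %I:%M %p")

# Assumption: a new cycle begins wherever y drops relative to the previous row.
# The running count of drops gives each row a cycle label (0, 1, 2, ...).
df["cycle"] = (df["y"].diff() < 0).cumsum()


def cycle_slope(group):
    """Least-squares slope of y against elapsed minutes within one cycle."""
    t_min = (group["time"] - group["time"].iloc[0]).dt.total_seconds() / 60.0
    slope, _intercept = np.polyfit(t_min.to_numpy(), group["y"].to_numpy(), 1)
    return slope


# One slope per cycle, in units of y per minute.
slopes = pd.Series(
    {cycle: cycle_slope(group) for cycle, group in df.groupby("cycle")}
)

print(slopes)
```

On the idealized sample data this yields four cycles, each with a slope of roughly 0.00208 per minute (about 3 per day).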