【Posted】: 2018-07-26 08:46:26
【Question】:
I am working with the index of a pandas DataFrame, index = groupedCrimes.index:
DatetimeIndex(['2014-06-30', '2014-07-31', '2014-08-31', '2014-09-30',
'2014-10-31', '2014-11-30', '2014-12-31', '2015-01-31',
'2015-02-28', '2015-03-31', '2015-04-30', '2015-05-31',
'2015-06-30', '2015-07-31', '2015-08-31', '2015-09-30',
'2015-10-31', '2015-11-30', '2015-12-31', '2016-01-31',
'2016-02-29', '2016-03-31', '2016-04-30', '2016-05-31',
'2016-06-30', '2016-07-31', '2016-08-31', '2016-09-30',
'2016-10-31', '2016-11-30', '2016-12-31', '2017-01-31',
'2017-02-28', '2017-03-31', '2017-04-30', '2017-05-31'],
dtype='datetime64[ns]', name='Month', freq='M')
I am converting it away from datetime64[ns] so that I can run sklearn's linear regression on it.
import numpy as np
import pandas as pd
# Change the dates to numbers (days since the first date); I am not sure this is the best way
groupedCrimes.index = pd.to_datetime(groupedCrimes.index)
groupedCrimes.index = (groupedCrimes.index - groupedCrimes.index.min()) / np.timedelta64(1,'D')
This converts it to the following:
[[0.00000000e+00]
[3.58796296e-13]
[7.17592593e-13]
[1.06481481e-12]
[1.42361111e-12]
[1.77083333e-12]
[2.12962963e-12]
[2.48842593e-12]
[2.81250000e-12]
[3.17129630e-12]
[3.51851852e-12]
[3.87731481e-12]
[4.22453704e-12]
[4.58333333e-12]
[4.94212963e-12]
[5.28935185e-12]
[5.64814815e-12]
[5.99537037e-12]
[6.35416667e-12]
[6.71296296e-12]
[7.04861111e-12]
[7.40740741e-12]
[7.75462963e-12]
[8.11342593e-12]
[8.46064815e-12]
[8.81944444e-12]
[9.17824074e-12]
[9.52546296e-12]
[9.88425926e-12]
[1.02314815e-11]
[1.05902778e-11]
[1.09490741e-11]
[1.12731481e-11]
[1.16319444e-11]
[1.19791667e-11]
[1.23379630e-11]]
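One reading of these numbers, offered as an observation rather than something the post states: each value matches the number of elapsed days divided by the number of nanoseconds in a day (for example, 31 / 8.64e13 ≈ 3.588e-13). Under that assumption, a minimal sketch for mapping the printed values back to dates could look like this. The names `NS_PER_DAY`, `scaled`, `days` and `dates` are illustrative, and the start date is taken from the first entry of the index shown above.

```python
import numpy as np
import pandas as pd

# Assumption: the printed values are (elapsed days) / (nanoseconds per day).
NS_PER_DAY = 86_400 * 10**9

# First few values copied from the output above.
scaled = np.array([0.0, 3.58796296e-13, 7.17592593e-13, 1.06481481e-12])

# Undo the scaling and round to whole days to absorb print truncation.
days = np.rint(scaled * NS_PER_DAY).astype(int)

# Add the day offsets to the first date of the original index.
dates = pd.Timestamp("2014-06-30") + pd.to_timedelta(days, unit="D")
print(days)
print(dates)
```

If the recovered offsets line up with the month-end dates in the original index, the scaling assumption holds for this data.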
Then, for example, I can get a prediction for one of these values, which stands for a date:
[in] model.predict(3.58796296e-13)
[out] array([5990.81354452])
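How can I:
- A) Convert these numbers back to dates, so that I know which date I am predicting for?
- B) Convert future dates into this format, so that I can predict for future dates?
Is there a better way to convert and handle the dates?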
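For illustration, points A and B can both be framed as a round trip between dates and "whole days since a fixed origin". The sketch below is one possible approach, not a definitive answer: `dates_to_days` and `days_to_dates` are hypothetical helper names, and the short `index` is a stand-in for groupedCrimes.index. The feature is returned as a 2-D column because scikit-learn estimators expect input of shape (n_samples, n_features).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for groupedCrimes.index (month-end dates).
index = pd.DatetimeIndex(["2014-06-30", "2014-07-31", "2014-08-31"], name="Month")
origin = index.min()  # keep this value; it is needed to convert in both directions

def dates_to_days(dates, origin):
    """Whole days elapsed since origin, as a 2-D float column."""
    delta = pd.DatetimeIndex(dates) - origin
    return delta.days.to_numpy().astype(float).reshape(-1, 1)

def days_to_dates(days, origin):
    """Inverse of dates_to_days: day offsets back to timestamps."""
    return origin + pd.to_timedelta(np.asarray(days).ravel(), unit="D")

X = dates_to_days(index, origin)                   # feature column for fitting
recovered = days_to_dates(X, origin)               # A) numbers back to dates
X_future = dates_to_days(["2017-06-30"], origin)   # B) a future date as a feature
print(X.ravel(), list(recovered), X_future)
```

With a feature built this way, `X` could be passed to `model.fit` and `X_future` to `model.predict`, as long as the same `origin` is used for both.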
【Discussion】:
Tags: python pandas numpy scikit-learn linear-regression