【Question title】: Error message "Invalid comparison between dtype=datetime64[ns] and date"
【Posted】: 2021-07-06 09:22:39
【Question description】:

I am trying to run the following two snippets, which are part of a Databricks tutorial.

from sklearn.metrics import mean_squared_error, mean_absolute_error
from math import sqrt
from datetime import date

# get historical actuals & predictions for comparison
actuals_pd = history_pd[ history_pd['ds'] < date(2018, 1, 1) ]['y']  # line 1
predicted_pd = forecast_pd[ forecast_pd['ds'] < date(2018, 1, 1) ]['yhat']  # line 2

However, the line predicted_pd = forecast_pd[ forecast_pd['ds'] < date(2018, 1, 1) ]['yhat'] raises TypeError: Invalid comparison between dtype=datetime64[ns] and date. The line before it looks very similar but does not raise this error. I also printed the types of predicted_pd and actuals_pd for reference.

TypeError                                 Traceback (most recent call last)
<ipython-input-15-748394f8994f> in <module>
----> 1 predicted_pd = forecast_pd[ forecast_pd['ds'] < date(2018, 1, 1) ]['yhat']

~\Anaconda3\envs\sparkenv\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
     63         other = item_from_zerodim(other)
     64 
---> 65         return method(self, other)
     66 
     67     return new_method

~\Anaconda3\envs\sparkenv\lib\site-packages\pandas\core\ops\__init__.py in wrapper(self, other)
    368         rvalues = extract_array(other, extract_numpy=True)
    369 
--> 370         res_values = comparison_op(lvalues, rvalues, op)
    371 
    372         return self._construct_result(res_values, name=res_name)

~\Anaconda3\envs\sparkenv\lib\site-packages\pandas\core\ops\array_ops.py in comparison_op(left, right, op)
    228     if should_extension_dispatch(lvalues, rvalues):
    229         # Call the method on lvalues
--> 230         res_values = op(lvalues, rvalues)
    231 
    232     elif is_scalar(rvalues) and isna(rvalues):

~\Anaconda3\envs\sparkenv\lib\site-packages\pandas\core\ops\common.py in new_method(self, other)
     63         other = item_from_zerodim(other)
     64 
---> 65         return method(self, other)
     66 
     67     return new_method

~\Anaconda3\envs\sparkenv\lib\site-packages\pandas\core\arrays\datetimelike.py in wrapper(self, other)
    116             other = _validate_comparison_value(self, other)
    117         except InvalidComparison:
--> 118             return invalid_comparison(self, other, op)
    119 
    120         dtype = getattr(other, "dtype", None)

~\Anaconda3\envs\sparkenv\lib\site-packages\pandas\core\ops\invalid.py in invalid_comparison(left, right, op)
     32     else:
     33         typ = type(right).__name__
---> 34         raise TypeError(f"Invalid comparison between dtype={left.dtype} and {typ}")
     35     return res_values
     36 

TypeError: Invalid comparison between dtype=datetime64[ns] and date

【Comments】:

  • You can try: forecast_pd[ forecast_pd['ds'] < pd.to_datetime('2018-01-01') ]
  • @Erfan, thank you for your answer. It works. Could you explain why this modification works, and why actuals_pd = history_pd[ history_pd['ds']

Tags: pandas scikit-learn databricks python-datetime python-dataclasses


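【Solution 1】:

pandas stores datetime columns as datetime64[ns] by default, so you should not compare them with datetime.date objects. Instead, you can simply use a date string, which pandas parses into a timestamp before comparing. In addition, using loc to select rows and columns in one step gives a cleaner syntax than the chained indexing in your example.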

datestr = '2018-01-01'
actuals_pd = history_pd.loc[history_pd['ds'] < datestr, 'y']  # line 1
predicted_pd = forecast_pd.loc[forecast_pd['ds'] < datestr, 'yhat']  # line 2
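
The thread does not say why line 1 succeeds while line 2 fails. One plausible explanation, which is an assumption and not something the post confirms, is that history_pd['ds'] has dtype object and holds datetime.date values, while forecast_pd['ds'] is datetime64[ns]. The sketch below reproduces that situation with two hypothetical series (ds_as_objects and ds_as_datetime64 are illustrative names, not taken from the tutorial) and shows two comparisons that work on the datetime64 column.

```python
from datetime import date

import pandas as pd

cutoff = date(2018, 1, 1)

# Hypothetical stand-ins for the two 'ds' columns in the question.
# A series built from datetime.date values keeps dtype object.
ds_as_objects = pd.Series([date(2017, 6, 1), date(2018, 6, 1)])
# A series converted with pd.to_datetime has a datetime64 dtype.
ds_as_datetime64 = pd.to_datetime(pd.Series(["2017-06-01", "2018-06-01"]))

# object dtype: pandas compares element by element with plain Python
# rules, so date < date works.
object_mask = ds_as_objects < cutoff

# datetime64 dtype: recent pandas versions reject an ordering comparison
# against datetime.date with a TypeError, as in the traceback above.
try:
    ds_as_datetime64 < cutoff
    date_comparison_failed = False
except TypeError:
    date_comparison_failed = True

# Two ways to make the comparison valid on the datetime64 column:
timestamp_mask = ds_as_datetime64 < pd.Timestamp(cutoff)  # convert the date
string_mask = ds_as_datetime64 < "2018-01-01"             # let pandas parse a string

print(ds_as_objects.dtype, ds_as_datetime64.dtype)
print(object_mask.tolist(), timestamp_mask.tolist(), string_mask.tolist())
print("date comparison failed:", date_comparison_failed)
```

Printing history_pd['ds'].dtype and forecast_pd['ds'].dtype in the original notebook would show whether this assumption holds for the tutorial data.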
