Error when creating dummies for a range of time period, Pandas: answers

【Question title】: Error when creating dummies for a range of time period, Pandas
【Posted】: 2017-07-05 18:15:31
【Question description】:

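I have a pandas DataFrame with a time column running from 00:00:00 to 23:00:00, of dtype timedelta64[ns]. I want to create a new column df['m_hours'] holding 1s and 0s based on a time range. For example, it should be 1 when the time is between 01:00:00 and 04:00:00 and 0 otherwise. I tried the following code: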

df['m_hours'] = np.where(df['hour']>= '01:00:00'& df['hour']<= '04:00:00', '1', '0')

I got the error TypeError: cannot compare a dtyped [timedelta64[ns]] array with a scalar of type [bool]. Then I tried:

df['m_hours'] = np.where(df[(df['hour']>= '01:00:00')& (df['hour']<= '04:00:00'), '1', '0']

and then I got another error: SyntaxError: unexpected EOF while parsing

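This post looked promising, but it did not help much in my case. Is there another way to create dummies for a range of time? Any help would be appreciated.

Thanks!

Edit: as requested, here are a sample df and the dtypes.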

date       hour                      avg_price
2016-05-01 00:00:00                  69.5                  
2016-05-01 01:00:00                  67.0                  
2016-05-01 02:00:00                  66.0                  
2016-05-01 03:00:00                  66.0                
2016-05-01 04:00:00                  65.0                  
2016-05-01 05:00:00                  65.0                  
2016-05-01 06:00:00                  65.5                 
2016-05-01 07:00:00                  69.0                
2016-05-01 08:00:00                  72.0                  
2016-05-01 09:00:00                  77.0                 
2016-05-01 10:00:00                  80.0                  
2016-05-01 11:00:00                  81.0                 
2016-05-01 12:00:00                  82.0                  
2016-05-01 13:00:00                  85.0                  
2016-05-01 14:00:00                  85.0                  
2016-05-01 15:00:00                  85.0                  
2016-05-01 16:00:00                  88.0                  
2016-05-01 17:00:00                  87.0                  
2016-05-01 18:00:00                  86.0                  
2016-05-01 19:00:00                  81.0                  
2016-05-01 20:00:00                  79.0                  
2016-05-01 21:00:00                  78.0                  
2016-05-01 22:00:00                  76.0                  
2016-05-01 23:00:00                  74.0                  
2016-05-02 00:00:00                  73.0                   
2016-05-02 01:00:00                  68.0                  
2016-05-02 02:00:00                  66.0                   
2016-05-02 03:00:00                  66.0                   
2016-05-02 04:00:00                  64.0                  
2016-05-02 05:00:00                  67.0

The dtypes are:

date                    datetime64[ns]
hour                    timedelta64[ns]
avg_price               float64

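【Comments】:

Does df['m_hours'] = ((df['hour'] >= '01:00:00') & (df['hour'] <= '04:00:00')).astype(int) work for you?
@lanery, I tried that and it did not work. I got the suggestion Try using .loc[row_indexer,col_indexer] = value instead; then I converted with pd.to_timedelta and still got the error mentioned in the question.
Can you give a sample df?
I think it would be more helpful if you provided a sample of the DataFrame (just the first 5-10 rows) and added it to your question. I think the problem may be that your df['hour'] column is timedelta64[ns] rather than datetime64[ns].
@lanery I added a sample df to the question.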

Tags: python-3.x pandas time-series

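【Solution 1】:

After some more research, I am posting an answer to my own question in the hope that it helps anyone else who runs into this problem. Without changing the column's dtype, I tried the following code: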

df['m_hour'] = np.where((df['hour'] >= '01:00:00') & (df['hour'] <= '04:00:00'),'1','0')

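The parentheses were missing in the np.where condition. Each comparison needs its own parentheses because & binds more tightly than >= and <=, so the first attempt applied & to '01:00:00' and df['hour'] before any comparison was made. The second attempt wrapped the arguments in a stray df[...] and never closed the np.where( call, which is what raised the SyntaxError.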
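For anyone who wants to try this end to end, here is a minimal sketch. It rebuilds only the first six rows of the sample from the question, and it assumes you would rather compare against explicit pd.Timedelta bounds than against strings. The names start, end, mask and m_hours_between are mine, not from the question. It also writes integer 1/0 rather than the strings '1'/'0' used above, which is usually what you want for a dummy variable.

```python
import numpy as np
import pandas as pd

# First six rows of the sample in the question; 'hour' is a timedelta column.
df = pd.DataFrame({
    "date": pd.to_datetime(["2016-05-01"] * 6),
    "hour": pd.to_timedelta(np.arange(6), unit="h"),
    "avg_price": [69.5, 67.0, 66.0, 66.0, 65.0, 65.0],
})

# Explicit bounds (hypothetical names) so both sides of each comparison are timedeltas.
start = pd.Timedelta("01:00:00")
end = pd.Timedelta("04:00:00")

# Each comparison is parenthesised because & binds tighter than >= and <=.
mask = (df["hour"] >= start) & (df["hour"] <= end)
df["m_hours"] = np.where(mask, 1, 0)

# Alternative: Series.between includes both endpoints by default.
df["m_hours_between"] = df["hour"].between(start, end).astype(int)

print(df)
```

Both columns should come out the same, with 1 for 01:00:00 through 04:00:00 and 0 elsewhere. If you really need the string labels, swap the 1, 0 arguments of np.where back to '1', '0'.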

【Comments】: