【Question title】: float() argument must be a string or a number, not 'Timestamp': python pandas
【Posted】: 2020-09-18 05:36:45
【Question description】:

I have the following code:

from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('data/C2A2_data/BinnedCsvs_d400/fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89.csv') 
newdata = data[(data['Date'] >= '2005-01-01') & (data['Date'] <= '2014-12-12')]
datamax = newdata[newdata['Element']=='TMAX']
datamin = newdata[newdata['Element']=='TMIN']
datamax = datamax.groupby('Date').max()
datamin = datamin.groupby('Date').min()
datamax = datamax.reset_index()
datamin = datamin.reset_index()
datamax = datamax[~datamax['Date'].str.contains("02-29")]
datamin = datamin[~datamin['Date'].str.contains("02-29")]

breakoutdata = data[(data['Date']  > '2014-12-31')]
datamax2015 = breakoutdata[breakoutdata['Element']=='TMAX']
datamin2015 = breakoutdata[breakoutdata['Element']=='TMIN']
datamax2015 = datamax2015.groupby('Date').max()
datamin2015 = datamin2015.groupby('Date').min()
datamax2015 = datamax2015.reset_index()
datamin2015 = datamin2015.reset_index()
datamax2015 = datamax2015[~datamax2015['Date'].str.contains("02-29")]
datamin2015 = datamin2015[~datamin2015['Date'].str.contains("02-29")]

datamin['Date'] = pd.to_datetime(datamin['Date'])
datamin2015['Date'] = pd.to_datetime(datamin2015['Date'])
datamin["day_of_year"] = datamin["Date"].dt.dayofyear
datamin2015["day_of_year"] = datamin["Date"].dt.dayofyear
dataminappend = datamin2015.join(datamin,on="day_of_year",how="left",lsuffix="2015")
lower = dataminappend.loc[dataminappend["Data_Value2015"]<dataminappend["Data_Value"]]

datamax['Date'] = pd.to_datetime(datamax['Date'])
datamax2015['Date'] = pd.to_datetime(datamax2015['Date'])
datamax["day_of_year"] = datamax["Date"].dt.dayofyear
datamax2015["day_of_year"] = datamax["Date"].dt.dayofyear
datamaxappend = datamax2015.join(datamax,on="day_of_year",how="left",lsuffix="2015")

upper = datamaxappend.loc[datamaxappend["Data_Value2015"]>datamaxappend["Data_Value"]]
upper = upper.rename(columns={"Date2015":"Date","Data_Value2015":"Data_Value"})
lower = lower.rename(columns={"Date2015":"Date","Data_Value2015":"Data_Value"})
upper['Date'] = upper['Date'].map(lambda x: x.strftime("%d/%m/%Y, %H:%M:%S"))
lower['Date'] = lower['Date'].map(lambda x: x.strftime("%d/%m/%Y, %H:%M:%S")) 

ax = plt.gca()
plt.plot(datamax['Date'],datamax['Data_Value'],color='red')
plt.plot(datamax['Date'],datamax['Data_Value'], color='blue')
plt.scatter(upper['Date'],upper['Data_Value'],color='red')
plt.scatter(lower['Date'],lower['Data_Value'], color='blue')

plt.ylabel("Temperature (degrees C)",color='navy')
plt.xlabel("Date",color='navy',labelpad=15)
plt.title('Record high and low temperatures by day (2005-2014)', alpha=1.0,color='brown',y=1.08)
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.35),
          fancybox=False,labels=['Record high','Record low'])
plt.xticks(rotation=30)
plt.fill_between(range(len(datamax['Date'])), datamax['Data_Value'], datamin['Data_Value'],color='yellow',alpha=0.8)
plt.show()

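However, I keep getting this error: TypeError: float() argument must be a string or a number, not 'Timestamp'

It is raised by this line: plt.scatter(upper['Date'],upper['Data_Value'],color='red')

Can anyone give me a hand?

I can't see anything obviously wrong with my code...

Here is a snippet of the DataFrame data: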

                 ID        Date Element  Data_Value
0       USW00094889  2014-11-12    TMAX          22
1       USC00208972  2009-04-29    TMIN          56
2       USC00200032  2008-05-26    TMAX         278
3       USC00205563  2005-11-11    TMAX         139
4       USC00200230  2014-02-27    TMAX        -106
5       USW00014833  2010-10-01    TMAX         194
6       USC00207308  2010-06-29    TMIN         144
7       USC00203712  2005-10-04    TMAX         289
8       USW00004848  2007-12-14    TMIN         -16
9       USC00200220  2011-04-21    TMAX          72
10      USC00205822  2013-01-16    TMAX          11
11      USC00205822  2008-05-29    TMIN          28
12      USC00203712  2008-10-17    TMIN          17
13      USC00205563  2006-05-14    TMAX         183
14      USC00200842  2006-05-14    TMAX         122
15      USC00205563  2014-12-07    TMAX          67
16      USC00205822  2008-09-07    TMAX         250
17      USC00205450  2006-04-22    TMIN          67
18      USC00203712  2008-02-22    TMAX         -44
19      USC00205563  2015-01-03    TMIN         -39
20      USC00201250  2011-03-28    TMIN         -78
21      USC00200230  2008-02-10    TMAX          -6
22      USC00207320  2008-02-03    TMIN         -39

[165085 rows x 4 columns]

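I would also welcome a simpler way of doing what the code above does.

【Comments】:

  • It would also help if you provided a small sample DataFrame such as df = pd.DataFrame(data=[...]) that can be used directly with your code. That would make it a minimal reproducible example.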
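A frame of the kind this comment asks for can be typed in directly from the snippet printed in the question. The sketch below copies the first five rows shown there; the variable name `data` matches the question's code, and nothing beyond those printed values is assumed.

```python
import pandas as pd

# First five rows copied from the snippet printed in the question.
data = pd.DataFrame(
    data=[
        ["USW00094889", "2014-11-12", "TMAX", 22],
        ["USC00208972", "2009-04-29", "TMIN", 56],
        ["USC00200032", "2008-05-26", "TMAX", 278],
        ["USC00205563", "2005-11-11", "TMAX", 139],
        ["USC00200230", "2014-02-27", "TMAX", -106],
    ],
    columns=["ID", "Date", "Element", "Data_Value"],
)

# 'Date' is still plain text here, just as it is right after read_csv.
print(data.dtypes)
```

Replacing the `pd.read_csv(...)` line with a literal frame like this lets anyone run the rest of the code without access to the CSV file.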

Tags: python pandas matplotlib plot time


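【Solution 1】:
  1. The Python code should return a line graph of the record high and record low temperatures by day of the year over the period 2005-2014. The area between the record high and record low temperatures for each day should be shaded.
  2. Then overlay a scatter plot of the 2015 data for any points (highs and lows) where the ten-year (2005-2014) record high or record low was broken in 2015.
  3. Remove leap days (i.e. February 29).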


from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt

pd.set_option("display.max_rows",None,"display.max_columns",None)
data = pd.read_csv('data/C2A2_data/BinnedCsvs_d400/fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89.csv') 
newdata = data[(data['Date'] >= '2005-01-01') & (data['Date'] <= '2014-12-12')]
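# .copy() gives independent frames, so the column assignments below do not trigger SettingWithCopyWarning
datamax = newdata[newdata['Element']=='TMAX'].copy()
datamin = newdata[newdata['Element']=='TMIN'].copy()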
datamax['Date'] = pd.to_datetime(datamax['Date'])
datamin['Date'] = pd.to_datetime(datamin['Date'])
datamax["day_of_year"] = datamax["Date"].dt.dayofyear
datamax = datamax.groupby('day_of_year').max()
datamin["day_of_year"] = datamin["Date"].dt.dayofyear
datamin = datamin.groupby('day_of_year').min()
datamax = datamax.reset_index()
datamin = datamin.reset_index()
datamin['Date'] = datamin['Date'].dt.strftime('%Y-%m-%d')
datamax['Date'] = datamax['Date'].dt.strftime('%Y-%m-%d')
datamax = datamax[~datamax['Date'].str.contains("02-29")]
datamin = datamin[~datamin['Date'].str.contains("02-29")]

breakoutdata = data[(data['Date']  > '2014-12-31')]
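# .copy() gives independent frames, so the column assignments below do not trigger SettingWithCopyWarning
datamax2015 = breakoutdata[breakoutdata['Element']=='TMAX'].copy()
datamin2015 = breakoutdata[breakoutdata['Element']=='TMIN'].copy()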
datamax2015['Date'] = pd.to_datetime(datamax2015['Date'])
datamin2015['Date'] = pd.to_datetime(datamin2015['Date'])
datamax2015["day_of_year"] = datamax2015["Date"].dt.dayofyear
datamax2015 = datamax2015.groupby('day_of_year').max()
datamin2015["day_of_year"] = datamin2015["Date"].dt.dayofyear
datamin2015 = datamin2015.groupby('day_of_year').min()
datamax2015 = datamax2015.reset_index()
datamin2015 = datamin2015.reset_index()
datamin2015['Date'] = datamin2015['Date'].dt.strftime('%Y-%m-%d')
datamax2015['Date'] = datamax2015['Date'].dt.strftime('%Y-%m-%d')
datamax2015 = datamax2015[~datamax2015['Date'].str.contains("02-29")]
datamin2015 = datamin2015[~datamin2015['Date'].str.contains("02-29")]

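# join() matches the caller's "day_of_year" column against the other frame's index, so index the
# 2005-2014 frame by day_of_year first; its columns then receive the "_new" suffix.
dataminappend = datamin2015.join(datamin.set_index("day_of_year"),on="day_of_year",rsuffix="_new")
# 2015 set a record low where its value is below the 2005-2014 minimum
lower = dataminappend.loc[dataminappend["Data_Value_new"]>dataminappend["Data_Value"]].copy()
datamaxappend = datamax2015.join(datamax.set_index("day_of_year"),on="day_of_year",rsuffix="_new")
# 2015 set a record high where its value is above the 2005-2014 maximum
upper = datamaxappend.loc[datamaxappend["Data_Value_new"]<datamaxappend["Data_Value"]].copy()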

upper['Date'] = pd.to_datetime(upper['Date']) 
lower['Date'] = pd.to_datetime(lower['Date']) 
datamax['Date'] = pd.to_datetime(datamax['Date']) 
datamin['Date'] = pd.to_datetime(datamin['Date']) 

ax = plt.gca()
plt.plot(datamax['day_of_year'],datamax['Data_Value'],color='red')
plt.plot(datamin['day_of_year'],datamin['Data_Value'], color='blue')
plt.scatter(upper['day_of_year'],upper['Data_Value'],color='purple')
plt.scatter(lower['day_of_year'],lower['Data_Value'], color='cyan')

plt.ylabel("Temperature (degrees C)",color='navy')
plt.xlabel("Day of the year",color='navy',labelpad=15)
plt.title('Record high and low temperatures by day (2005-2014)', alpha=1.0,color='brown',y=1.08)
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.35),fancybox=False,labels=['Record high','Record low'])
plt.xticks(rotation=30)
# shade against the same x values as the two lines (day_of_year starts at 1, range() would start at 0)
plt.fill_between(datamax['day_of_year'], datamax['Data_Value'], datamin['Data_Value'],color='yellow',alpha=0.8)
plt.show()


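  1. I converted the 'Date' column to strings with datamin['Date'] = datamin['Date'].dt.strftime('%Y-%m-%d').

  2. I then converted it back to datetime with upper['Date'] = pd.to_datetime(upper['Date']).

  3. Finally, I used the numeric 'day_of_year' column, rather than 'Date', as the x values, as the plotting calls above show.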
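For readers who want a shorter route to the same result, here is a minimal sketch of the three task steps on made-up data, not the asker's CSV. It swaps in a month-day key instead of `day_of_year`, because day-of-year numbering shifts by one after February 28 in leap years; the names `month_day`, `record_high`, `high_2015` and `broken` are illustrative and do not appear in the thread.

```python
import pandas as pd

# Made-up TMAX readings (not the asker's data).
data = pd.DataFrame(
    {
        "Date": ["2005-01-01", "2012-01-01", "2012-02-29", "2014-03-01",
                 "2015-01-01", "2015-03-01"],
        "Element": ["TMAX"] * 6,
        "Data_Value": [100, 120, 50, 200, 150, 180],
    }
)
data["Date"] = pd.to_datetime(data["Date"])

# Drop Feb 29 directly on the datetime column, with no string round trip.
is_leap_day = (data["Date"].dt.month == 2) & (data["Date"].dt.day == 29)
data = data[~is_leap_day].copy()

# Key each row by month-day so 1 March lines up across leap and non-leap years.
data["month_day"] = data["Date"].dt.strftime("%m-%d")

tmax = data[data["Element"] == "TMAX"]
decade = tmax[tmax["Date"].dt.year.between(2005, 2014)]
y2015 = tmax[tmax["Date"].dt.year == 2015]

# Record high per calendar day for 2005-2014, and the 2015 high per calendar day.
record_high = decade.groupby("month_day")["Data_Value"].max().rename("record")
high_2015 = y2015.groupby("month_day")["Data_Value"].max().rename("value_2015")

# Align on the shared month-day index and keep the days where 2015 beat the record.
both = pd.concat([record_high, high_2015], axis=1).dropna()
broken = both[both["value_2015"] > both["record"]]
print(broken)
```

The record lows follow the same pattern with `TMIN`, `.min()` and a `<` comparison. Because `broken` is indexed by a plain month-day label, it can be plotted against a numeric position without passing Timestamps to matplotlib at all.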

【Comments】:

    【Solution 2】:
    upper = upper['Date'].map(lambda x: x.strftime("%d/%m/%Y, %H:%M:%S"))
    print(upper)
    

    You can't simply cast a Timestamp to a string in the format you want; you need to use its .strftime() method. More information here: https://www.programiz.com/python-programming/datetime/strftime

    Try this:

    import pandas as pd
    import datetime
    now = datetime.datetime.now()
    df = pd.DataFrame(data=[[now]]*10, columns=["Date"])
    df['Date_new'] = df['Date'].map(lambda x:x.strftime("%d.%m.%Y, %H:%M:%S"))
    print(df['Date_new'])
    

    Output:

    0    30.05.2020, 18:21:30
    1    30.05.2020, 18:21:30
    2    30.05.2020, 18:21:30
    3    30.05.2020, 18:21:30
    4    30.05.2020, 18:21:30
    5    30.05.2020, 18:21:30
    6    30.05.2020, 18:21:30
    7    30.05.2020, 18:21:30
    8    30.05.2020, 18:21:30
    9    30.05.2020, 18:21:30
    Name: Date_new, dtype: object
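The same formatting can also be done without a lambda through the `.dt` accessor. This is a small sketch under the assumption that the column already has a datetime dtype; the fixed timestamp is an arbitrary value chosen so the result is repeatable.

```python
import datetime
import pandas as pd

# Fixed timestamp (arbitrary, chosen for this example) so the result is repeatable.
stamp = datetime.datetime(2020, 5, 30, 18, 21, 30)
df = pd.DataFrame(data=[[stamp]] * 3, columns=["Date"])

# Vectorised formatting through the .dt accessor, no lambda required.
df["Date_new"] = df["Date"].dt.strftime("%d.%m.%Y, %H:%M:%S")

# The formatted column holds text; the original column keeps its datetime dtype.
print(df["Date_new"].iloc[0])
print(df["Date"].dtype)
```

Note that the formatted values are ordinary strings, so a plotting library will no longer see them as dates. Keeping the original datetime column for the x axis and using the formatted one only for labels or printing avoids mixing the two.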
    

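    【Comments】:

    • lower = lower['Date'].map(lambda x:x.strftime("%d/%m/%Y, %H:%M:%S") ^ SyntaxError: invalid syntax. This is the error I get.
    • lower = lower['Date'].map(lambda x:x.strftime("%d/%m/%Y, %H:%M:%S")) You were missing a ")" at the end.
    • 'DataFrame' object has no attribute 'map' is the error I get now!
    • I'm still getting the same error. How can I fix this?
    • Without the original data it's hard to see what is going wrong. Could you recreate the DataFrame from the first 10 or so rows in a form that can be copied and pasted? For example: data = pd.DataFrame(data=[....], columns=[...]) instead of data = pd.read_csv('data/C2A2_data/BinnedCsvs_d400/fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89.csv')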
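One possible explanation for the "'DataFrame' object has no attribute 'map'" message, offered as an assumption because the thread never confirms it: in the question's code the joined frame holds both `Date2015` and `Date`, so renaming `Date2015` to `Date` leaves two columns with the same label, and `upper['Date']` then returns a DataFrame instead of a Series. The sketch below reproduces that behaviour on a made-up two-row frame; `joined`, `renamed` and `fixed` are illustrative names.

```python
import pandas as pd

# Made-up frame imitating the joined result: a suffixed 2015 column next to the 2005-2014 one.
joined = pd.DataFrame(
    {
        "Date2015": pd.to_datetime(["2015-01-01", "2015-01-02"]),
        "Date": pd.to_datetime(["2010-01-01", "2011-01-02"]),
    }
)

# Renaming onto a label that already exists leaves two columns called 'Date'.
renamed = joined.rename(columns={"Date2015": "Date"})
print(type(renamed["Date"]))  # a DataFrame, because the label is ambiguous

# Dropping the old column first keeps the label unique, so selection yields a Series.
fixed = joined.drop(columns=["Date"]).rename(columns={"Date2015": "Date"})
print(type(fixed["Date"]))
```

If this is what happened, the two-column selection of Timestamps would also be what `plt.scatter` received as its x argument, which would be consistent with the original TypeError.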