【Question title】: TypeError: 'Timestamp' object is not subscriptable
【Posted】: 2018-07-04 06:39:38
【Question】:

I am visualizing an election polling dataset and need to use only the data from October 2012, but my code raises an error.

import pandas as pd
from pandas import Series,DataFrame
import numpy as np
poll_df=pd.read_csv('http://elections.huffingtonpost.com/pollster/2012-general-election-romney-vs-obama.csv')
row_in=0
xlimit=[]
for date in poll_df['Start Date']:
    if date[0:7] == '2012-10':
        xlimit.append(row_in)
        row_in += 1
    else:
        row_in+=1
print(min(xlimit))
print(max(xlimit))

Why does this error occur, and what does it mean?

【Question discussion】:

  • It means that date[0:7] makes no sense; the column contains timestamps, not strings.
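
The comment above can be reproduced in isolation. The following is a minimal sketch, assuming a single hand-made `pd.Timestamp` (the variable names `ts`, `message` and `as_text` are illustrative and not from the original post): slicing the timestamp raises the TypeError, while formatting it to a string first gives something that can be compared with '2012-10'.

```python
import pandas as pd

# A single Timestamp, like the values the loop receives once the column holds dates.
ts = pd.Timestamp("2012-10-31")

try:
    ts[0:7]  # slicing works on str, not on Timestamp
except TypeError as exc:
    message = str(exc)
    print(message)

# Format the timestamp as text first, then compare the text.
as_text = ts.strftime("%Y-%m")
print(as_text == "2012-10")
```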

Tags: python-3.x pandas numpy data-science


【Solution 1】:

Use:

url ='http://elections.huffingtonpost.com/pollster/2012-general-election-romney-vs-obama.csv'

Solution using datetimes: parse the column as dates in read_csv, then compare the strings produced by strftime and filter with boolean indexing:

poll_df = pd.read_csv(url, parse_dates=['Start Date'])

df = poll_df[poll_df['Start Date'].dt.strftime('%Y-%m') == '2012-10']

print(df['Start Date'].dtype)
datetime64[ns]

String solution: extract the first 7 characters by indexing with str:

poll_df = pd.read_csv(url)

df = poll_df[poll_df['Start Date'].str[:7] == '2012-10']

print(df['Start Date'].dtype)
object

print(df.head())

                Pollster Start Date    End Date  Entry Date/Time (ET)  \
18                YouGov 2012-10-31  2012-11-03  2012-11-04T16:24:50Z   
19                   Pew 2012-10-31  2012-11-03  2012-11-04T15:46:59Z   
21             Rasmussen 2012-10-31  2012-11-02  2012-11-03T10:54:09Z   
22     Purple Strategies 2012-10-31  2012-11-01  2012-11-02T12:31:41Z   
23  JZ Analytics/Newsmax 2012-10-30  2012-11-01  2012-11-02T22:57:27Z   

    Number of Observations     Population             Mode  Obama  Romney  \
18                 36472.0  Likely Voters         Internet   49.0    47.0   
19                  2709.0  Likely Voters       Live Phone   48.0    45.0   
21                  1500.0  Likely Voters  Automated Phone   48.0    48.0   
22                  1000.0  Likely Voters       IVR/Online   47.0    46.0   
23                  1030.0  Likely Voters         Internet   48.0    46.0   

    Undecided  Other                                       Pollster URL  \
18        3.0    NaN  http://elections.huffingtonpost.com/pollster/p...   
19        NaN    3.0  http://elections.huffingtonpost.com/pollster/p...   
21        2.0    1.0  http://elections.huffingtonpost.com/pollster/p...   
22        7.0    NaN  http://elections.huffingtonpost.com/pollster/p...   
23        6.0    NaN  http://elections.huffingtonpost.com/pollster/p...   

                                           Source URL     Partisan  \
18  http://cdn.yougov.com/r/1/ygTabs_november_like...  Nonpartisan   
19  http://www.people-press.org/2012/11/04/obama-g...  Nonpartisan   
21  http://www.rasmussenreports.com/public_content...  Nonpartisan   
22  http://www.purplestrategies.com/wp-content/upl...  Nonpartisan   
23                        http://www.jzanalytics.com/      Sponsor   

   Affiliation  Question Text  Question Iteration  
18        None            NaN                   1  
19        None            NaN                   1  
21        None            NaN                   1  
22        None            NaN                   1  
23         Rep            NaN                   1  
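
In case the URL above is not reachable, both approaches can be tried on a few inline rows. This is only a sketch: the CSV text below is made-up sample data (hypothetical pollsters A to D), not the real poll file, and names such as `csv_text`, `october_dt` and `october_str` are illustrative.

```python
import io
import pandas as pd

# Hypothetical sample rows standing in for the real poll CSV.
csv_text = """Pollster,Start Date,Obama,Romney
A,2012-10-31,49,47
B,2012-10-05,48,45
C,2012-09-28,47,46
D,2012-11-01,48,48
"""

# Datetime approach: parse the column, format as 'YYYY-MM', compare.
by_datetime = pd.read_csv(io.StringIO(csv_text), parse_dates=["Start Date"])
mask_dt = by_datetime["Start Date"].dt.strftime("%Y-%m") == "2012-10"
october_dt = by_datetime[mask_dt]

# String approach: leave the column as text and slice the first 7 characters.
by_string = pd.read_csv(io.StringIO(csv_text))
mask_str = by_string["Start Date"].str[:7] == "2012-10"
october_str = by_string[mask_str]

# Both filters keep the same rows (index labels of the October polls).
print(october_dt.index.tolist())
print(october_str.index.tolist())
```

Either way the row selection is done by a boolean mask, so no explicit loop or row counter is needed.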

【Discussion】:

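  • Thank you, but when I do this I now get an error saying min() arg is an empty sequence. What should I do now?
  • @Ben.hardy - What is your pandas version?
  • pandas version 0.23.0
  • It gives me this -> ValueError: min() arg is an empty sequence on the statement print(min(xlimit))
  • @Ben.hardy - But why do you need it? I did not use it in my answer. What do you expect from min(xlimit)?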
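
The thread does not say what xlimit is ultimately used for. If the goal is the first and last row positions of the October polls, one possible sketch (an assumption, not the answerer's method) derives them from the boolean mask and guards against the empty case that makes min() raise ValueError. The small `start_dates` series is made-up sample data, and `positions`, `first` and `last` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for poll_df['Start Date'] after parsing as dates.
start_dates = pd.Series(
    pd.to_datetime(["2012-11-01", "2012-10-31", "2012-10-05", "2012-09-28"])
)

mask = start_dates.dt.strftime("%Y-%m") == "2012-10"

# Integer row positions where the mask is True (replaces the manual row_in counter).
positions = np.flatnonzero(mask.to_numpy())

if positions.size:
    first, last = int(positions.min()), int(positions.max())
    print(first, last)
else:
    # Nothing matched: calling min()/max() here would fail on an empty sequence.
    print("no rows from 2012-10")
```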