【Question title】: Scrapy - NameError: name 'items' is not defined
【Posted】: 2021-01-25 12:22:33
【Question】:

I am trying to populate my item with the parsed data, but I get this error:

item = items()

NameError: name 'items' is not defined

when I run scrapy crawl usa_florida_scrapper.

Here is my spider code:

import scrapy
import re

class UsaFloridaScrapperSpider(scrapy.Spider):
    name            = 'usa_florida_scrapper'
    start_urls      = ['https://www.txlottery.org/export/sites/lottery/Games/index.html']    

    def parse(self, response):    
        item = items()      
        
        print('++++++ Latest Results for Powerball  ++++++++++')
        power_ball_html = (response.xpath("/html/body/div[1]/div[3]/div/div[1]/div[3]/div[2]/ol").extract_first())
        power_balls=(",".join(re.findall(r'<span>(.+)</span>',power_ball_html)))
        power_ball_special=(response.xpath("/html/body/div[1]/div[3]/div/div[1]/div[3]/div[2]/ol/li[6]/span[contains(@class, 'powerball')]/text()").get())        
        power_ball_jackpot = response.xpath('/html/body/div[1]/div[3]/div/div[1]/div[3]/div[1]/h1/text()').get()
        power_ball_multiplier = response.xpath('/html/body/div[1]/div[3]/div/div[1]/div[3]/div[2]/div[2]/h3/span/text()').get()
        
        item['LotteryKey']= '227'
        item['Date']= '2020-10-10'
        item['Balls']= power_balls
        item['SpecialBalls']= power_ball_special
        item['Multiplier']= power_ball_multiplier
        item['JackpotValue']= power_ball_jackpot
        
        
        yield item

Here is my item code, items.py:

import scrapy

class KariedanielItem(scrapy.Item):
    # define the fields for your item here like:
    LotteryKey = scrapy.Field()
    Date = scrapy.Field()
    Balls = scrapy.Field()
    SpecialBalls = scrapy.Field()
    Multiplier = scrapy.Field()
    JackpotValue = scrapy.Field()
    pass

【Comments】:

    Tags: python web-scraping scrapy


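    【Solution 1】:

    You are trying to instantiate something called items(), which does not exist anywhere in your code. Python raises NameError because that name was never defined or imported in the spider module.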

    def parse(self, response):    
        item = items()      
    

    Your Item class is defined as KariedanielItem, so that is what you need to instantiate.

    def parse(self, response):    
        item = KariedanielItem()   
    

    Remember that you also need to import that class into the spider.

    from your_project.items import KariedanielItem
    

    Here your_project is a placeholder for the name of your Scrapy project package.
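
The failure and the fix can be reproduced without Scrapy. The following is a minimal sketch, assuming a plain dict subclass as a stand-in for the real scrapy.Item class; parse_broken and parse_fixed are hypothetical helper names used only for illustration, not part of the spider.

```python
# Stand-in for the real item class. In the actual project this would be
# the scrapy.Item subclass imported from items.py (assumption: a dict
# subclass is close enough to show the name lookup behaviour).
class KariedanielItem(dict):
    pass


def parse_broken():
    # 'items' was never defined or imported, so the lookup fails at call time.
    return items()


def parse_fixed():
    # The class name is bound in this module, so instantiation works.
    item = KariedanielItem()
    item['LotteryKey'] = '227'
    return item


try:
    parse_broken()
    error_message = None
except NameError as exc:
    error_message = str(exc)

fixed_item = parse_fixed()
print(error_message)
print(fixed_item)
```

Note that the error only appears when parse runs, not when the module is loaded, which is why the crawl starts before it fails.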

    【Discussion】:

      【Solution 2】:

      First, import your item class at the top of the spider, replacing your_project_name with the name of your Scrapy project.

      from your_project_name.items import KariedanielItem
      

      Next, in the parse function, replace the line item = items() with item = KariedanielItem().

      Your code will then look like this:

      import scrapy
      import re
      from your_project_name.items import KariedanielItem
      
      class UsaFloridaScrapperSpider(scrapy.Spider):
          name            = 'usa_florida_scrapper'
          start_urls      = ['https://www.txlottery.org/export/sites/lottery/Games/index.html']    
      
          def parse(self, response):    
              item = KariedanielItem()      
              
              print('++++++ Latest Results for Powerball  ++++++++++')
              power_ball_html = (response.xpath("/html/body/div[1]/div[3]/div/div[1]/div[3]/div[2]/ol").extract_first())
              power_balls=(",".join(re.findall(r'<span>(.+)</span>',power_ball_html)))
              power_ball_special=(response.xpath("/html/body/div[1]/div[3]/div/div[1]/div[3]/div[2]/ol/li[6]/span[contains(@class, 'powerball')]/text()").get())        
              power_ball_jackpot = response.xpath('/html/body/div[1]/div[3]/div/div[1]/div[3]/div[1]/h1/text()').get()
              power_ball_multiplier = response.xpath('/html/body/div[1]/div[3]/div/div[1]/div[3]/div[2]/div[2]/h3/span/text()').get()
              
              item['LotteryKey']= '227'
              item['Date']= '2020-10-10'
              item['Balls']= power_balls
              item['SpecialBalls']= power_ball_special
              item['Multiplier']= power_ball_multiplier
              item['JackpotValue']= power_ball_jackpot
              
              
              yield item
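
Once the NameError is fixed, the next line that could fail is the re.findall call: extract_first() returns None when the XPath matches nothing, and re.findall raises TypeError on None. The following is a hedged sketch of a guard, assuming a hypothetical helper named extract_balls and a made-up sample_html string; it also swaps in a non-greedy pattern, which is a change from the original regex.

```python
import re


def extract_balls(ol_html):
    """Return the <span> contents joined by commas, or None if no HTML was found."""
    if ol_html is None:
        # The XPath matched nothing (for example, the page layout changed).
        return None
    # Non-greedy (.+?) so each <span> is captured separately even when
    # several of them sit on the same line.
    return ",".join(re.findall(r'<span>(.+?)</span>', ol_html))


# Made-up markup for illustration only; the real page may differ.
sample_html = "<ol><li><span>5</span></li><li><span>12</span></li></ol>"

balls = extract_balls(sample_html)
missing = extract_balls(None)
print(balls)
print(missing)
```

In the spider, the result of extract_first() would be passed to such a helper before assigning item['Balls'].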
      

      【Discussion】:
