【Posted】: 2017-07-09 14:32:58
【Question】:
I am trying to run a Scrapy exporter from the CLI with a custom delimiter, like this:
scrapy runspider beneficiari_2016.py -o beneficiari_2016.csv -t csv -a CSV_DELIMITER="\n"
The export itself works fine, but the delimiter is still the default comma (",").
If you know how to fix this, please let me know. Thanks!
Code:
import scrapy
from scrapy.item import Item, Field
import urllib.parse


class anmdm(Item):
    nume_beneficiar = Field()


class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['http://www.anm.ro/sponsorizari/afisare-2016/beneficiari?page=1']

    def parse(self, response):
        doctor = anmdm()
        doctors = []
        for item in response.xpath('//tbody/tr'):
            doctor['nume_beneficiar'] = item.xpath('td[5]//text()').extract_first()
            yield doctor
        next_page = response.xpath("//ul/li[@class='active']/following-sibling::li/a/@href").extract_first()
        if next_page is not None:
            next_page = response.urljoin(next_page)
            print(next_page)
            yield response.follow(next_page, self.parse)
【Discussion】:
Tags: python csv scrapy web-crawler delimiter
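One likely reason the flag has no effect: `-a NAME=VALUE` passes a spider argument (it ends up as an attribute on the spider), while `-s NAME=VALUE` is what overrides a setting. As far as I know, Scrapy itself does not read a setting called CSV_DELIMITER either, so changing the flag alone would probably not be enough. Scrapy's `CsvItemExporter` forwards its leftover keyword arguments to Python's `csv.writer`, so the usual approach is a small subclass that passes `delimiter=...` and is registered for the `csv` format in the `FEED_EXPORTERS` setting. The sketch below uses only the standard library to show what that `delimiter` keyword does once it reaches `csv.writer`. The helper `rows_to_csv`, the second column and the sample values are hypothetical and exist only for illustration.

```python
import csv
import io


def rows_to_csv(rows, **writer_kwargs):
    """Serialize rows with csv.writer, forwarding keyword arguments such as delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, **writer_kwargs)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data: the "suma" column is invented so that a delimiter shows up.
rows = [
    ["nume_beneficiar", "suma"],
    ["Popescu Ion", "100"],
]

# Without extra arguments, csv.writer separates fields with a comma.
default_output = rows_to_csv(rows)

# Passing delimiter changes the field separator; it must be a single character.
custom_output = rows_to_csv(rows, delimiter=";")

print(default_output)
print(custom_output)
```

Note that the item in the question has a single field, so no field delimiter appears in its output at all; each item is already written on its own line.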