【Question title】: Why are field names sliced, raising an error, when writing CSV in Python?
【Posted】: 2015-12-31 17:47:40
【Question】:

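While practicing with Selenium, I failed to write a dictionary to a CSV file. I searched for solutions to similar questions, but they did not help. My problem is that when I use DictWriter to write a Python dictionary to a CSV file, I get this exception: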

ValueError: dict contains fields not in fieldnames: u'S', u'k', u'u'

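But the field names are the ones in the Header list in the code below: u'Url', u'Name', u'Sku', u'Price', u'Color', u'Size'.

Why are they sliced into single characters, giving me this strange exception, even though I passed the correct field names to DictWriter?

My experimental code is: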

import os,sys,bs4,random,codecs,requests
import unicodecsv as csv
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from contextlib import contextmanager
from selenium.webdriver.support.expected_conditions import staleness_of
from selenium.webdriver.support import expected_conditions as EC



current_file =  sys.argv[0]
link_dir = os.path.dirname(current_file)
link_path = os.path.join(link_dir,'lnks.txt')

Image_folder = os.path.join(link_dir,"images")+"\\"

urls = [line.strip() for line in open(link_path, 'r')]
urls = list(set(urls))
url = urls[0]



driver = webdriver.Firefox()#Chrome()##chromedriver)##

base_url = 'http://www.hotleathers.com'

Header = [u'Url',u'Name',u'Sku',u'Price',u'Color',u'Size']
#def get_data(url):
#try:
print "Scraping : %s"%url
driver.get(url)
driver.implicitly_wait(3)
detpage_lnks = driver.find_elements_by_xpath("//div[@style='margin-top:0px;margin-bottom:5px']/a")
detpage_lnks = map(lambda x: x.get_attribute('href'),detpage_lnks)
for i in detpage_lnks:
    Data = []
    #try:
    driver.get(i)
    driver.implicitly_wait(3)
    Name_v=driver.find_element_by_xpath("//table [@class='showproductpage']/tbody/tr/td/h1").text
    Sku_v=driver.find_element_by_xpath("(//table[@cellspacing = '0'])[3]//td[@style='padding-left:5px; font-size:16px; font-weight:bold;']").text
    image_name = Sku_v+".jpg"
    image_url = "http://www.hotleathers.com/Assets/ProductImages/large/"+image_name
    res = requests.get(image_url)
    if res.status_code == requests.codes.ok:
        out = open(Image_folder+image_name,'wb')
        out.write(res.content)        
    Price_v=driver.find_element_by_xpath("((//table[@cellspacing = '0'])[3]//tr)[2]//span").text
    Color=driver.find_elements_by_xpath("(//table[@class='buyProductForm'])//tr[2]/td/select/option")
    Color_v = '"'+':'.join([i.text for i in Color[1:]])+'"'
    Size=driver.find_elements_by_xpath("(//table[@class='buyProductForm'])//tr[3]/td/select/option")
    Size_v = '"'+':'.join([i.text for i in Size[1:]])+'"'
    temp = [driver.current_url,Name_v,Sku_v,Price_v,Color_v,Size_v]
    Data.append(zip(Header,temp))
    Data = [item for sublst in Data for item in sublst]
    my_dict = dict(Data)
    with codecs.open(os.path.join(link_dir,"Image_info.csv"),'wb',encoding="utf-8") as f:
        # Using dictionary keys as fieldnames for the CSV file header
        writer = csv.DictWriter(f,delimiter=",", fieldnames=Header,lineterminator='\n')
        writer.writeheader()
        for d in my_dict:
            writer.writerow(d)             

driver.close()

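I tried both unicodecsv and csv, but neither worked.

【Comments】:

  • Please read the guide How do I ask a good question, especially the part about the Minimal, Complete, and Verifiable Example (MCVE). This will help you solve the problem yourself. If you do this and are still stuck, you can come back and post your MCVE, what you tried, and what the results were, so that we can help you better.

Tags: csv selenium python-2.6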


【Solution 1】:

After many attempts I found the solution shown in the script below. I had not understood that writerow needs a dictionary!
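The misunderstanding can be shown in isolation. The following is a minimal sketch, assuming Python 3 and the standard csv module rather than the Python 2 and unicodecsv setup of the question; the row values are made-up placeholders. Looping over a dict yields its keys, so the question's loop handed writerow a string such as u'Sku'. Python 2's DictWriter then walked over that string character by character, which is presumably where u'S', u'k', u'u' came from. The sketch imitates that check by hand instead of calling writerow with a string, since the exact failure differs between Python versions.

```python
import csv
import io

# Field names from the question; the row values are made-up placeholders.
header = ["Url", "Name", "Sku", "Price", "Color", "Size"]
row = dict(zip(header, ["http://example.com/p/1", "Vest", "VS100", "$10", "Black", "M:L"]))

# Iterating over a dict yields its keys, so each item is a string such as "Sku".
items_seen = [d for d in row]

# Imitation of the validation Python 2's DictWriter applied to a row:
# with the string "Sku" in place of a dict, it iterates over characters.
unknown = [k for k in "Sku" if k not in header]

# The fix: hand the whole dict to writerow.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=header, lineterminator="\n")
writer.writeheader()
writer.writerow(row)

print(items_seen)
print(unknown)
print(buf.getvalue())
```

Here unknown holds the three characters of "Sku", matching the fields named in the exception, while the single writerow(row) call produces one header line and one data line.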

import os,sys,bs4,random,codecs,requests
import unicodecsv as csv
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from contextlib import contextmanager
from selenium.webdriver.support.expected_conditions import staleness_of
from selenium.webdriver.support import expected_conditions as EC



current_file =  sys.argv[0]
link_dir = os.path.dirname(current_file)
link_path = os.path.join(link_dir,'lnks.txt')

Image_folder = os.path.join(link_dir,"images")+"\\"

urls = [line.strip() for line in open(link_path, 'r')]
urls = list(set(urls))
url = urls[0]



driver = webdriver.Firefox()#Chrome()##chromedriver)##

base_url = 'http://www.hotleathers.com'

Header = [u'Url',u'Name',u'Sku',u'Price',u'Color',u'Size']
#def get_data(url):
#try:
print "Scraping : %s"%url
driver.get(url)
driver.implicitly_wait(3)
detpage_lnks = driver.find_elements_by_xpath("//div[@style='margin-top:0px;margin-bottom:5px']/a")
detpage_lnks = map(lambda x: x.get_attribute('href'),detpage_lnks)
for i in detpage_lnks:
    Data = []
    #try:
    driver.get(i)
    driver.implicitly_wait(3)
    Name_v=driver.find_element_by_xpath("//table [@class='showproductpage']/tbody/tr/td/h1").text
    Sku_v=driver.find_element_by_xpath("(//table[@cellspacing = '0'])[3]//td[@style='padding-left:5px; font-size:16px; font-weight:bold;']").text
    image_name = Sku_v+".jpg"
    image_url = "http://www.hotleathers.com/Assets/ProductImages/large/"+image_name
    res = requests.get(image_url)
    if res.status_code == requests.codes.ok:
        # Close the image file as soon as it has been written
        with open(Image_folder+image_name,'wb') as out:
            out.write(res.content)
    Price_v=driver.find_element_by_xpath("((//table[@cellspacing = '0'])[3]//tr)[2]//span").text
    Color=driver.find_elements_by_xpath("(//table[@class='buyProductForm'])//tr[2]/td/select/option")
    Color_v = '"'+':'.join([i.text for i in Color[1:]])+'"'
    Size=driver.find_elements_by_xpath("(//table[@class='buyProductForm'])//tr[3]/td/select/option")
    Size_v = '"'+':'.join([i.text for i in Size[1:]])+'"'
    temp = [driver.current_url,Name_v,Sku_v,Price_v,Color_v,Size_v]
    Data.append(zip(Header,temp))
    Data = [item for sublst in Data for item in sublst]
    my_dict = dict(Data)
    with codecs.open(os.path.join(link_dir,"Image_info.csv"),'ab',encoding="utf-8") as f:
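        # Pass Header as fieldnames so the columns keep a fixed order
        # (my_dict.keys() has no guaranteed order in Python 2)
        writer = csv.DictWriter(f,fieldnames=Header)
        # writerow takes the whole row dict, not one of its keys
        writer.writerow(my_dict)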

driver.close()
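The script above reopens the file in append mode for every product and never writes a header row. One possible variant is to collect the row dicts first and write them in a single pass. This is only a sketch, assuming Python 3 and the standard csv module instead of unicodecsv; write_rows and the sample rows are hypothetical names and data introduced for illustration, and an in-memory buffer stands in for Image_info.csv.

```python
import csv
import io

HEADER = ["Url", "Name", "Sku", "Price", "Color", "Size"]


def write_rows(stream, rows, fieldnames=HEADER):
    """Write the header once, then one CSV line per row dict."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Placeholder rows standing in for the scraped product pages.
rows = [
    dict(zip(HEADER, ["http://example.com/p/1", "Vest", "VS100", "$10", "Black", "M:L"])),
    dict(zip(HEADER, ["http://example.com/p/2", "Cap", "CP200", "$5", "Red:Blue", "One size"])),
]

# In a real script this would be open("Image_info.csv", "w", newline="").
buf = io.StringIO()
write_rows(buf, rows)
print(buf.getvalue())
```

Writing everything in one pass keeps the header at the top of the file and avoids repeated open and close calls inside the scraping loop, at the cost of holding all rows in memory until the end.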
