【Question Title】: Python Nested Loop Format
【Posted】: 2013-11-18 18:15:54
【Question】:

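I am using the following code to scrape data from Yahoo! Finance. However, the change I made to the first exception handler causes the code to loop without ever writing anything to the CSV file. I am currently stepping through it with a debugger, and I think the error is in the while loop.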

import urllib2
from BeautifulSoup import BeautifulSoup
import csv
import re
import urllib
from urllib2 import HTTPError
# import modules

symbolfile = open("symbols.txt")
symbolslist = symbolfile.read()
newsymbolslist = symbolslist.split("\n")

i = 0

f = csv.writer(open("pe_ratio.csv","wb"))
# short cut to write

f.writerow(["Name","PE","Revenue % Quarterly","ROA% YOY","Operating Cashflow","Debt to Equity"])
#first write row statement

# define name_company as the following
while i<len(newsymbolslist):
    try:
        page = urllib2.urlopen("http://finance.yahoo.com/q/ks?s="+newsymbolslist[i] +"%20Key%20Statistics").read()
    except urllib2.HTTPError:
        continue
        soup = BeautifulSoup(page)
        name_company = soup.findAll("div", {"class" : "title"}) 
        for name in name_company: #add multiple iterations?        
            all_data = soup.findAll('td', "yfnc_tabledata1")
            stock_name = name.find('h2').string #find company's name in name_company with h2 tag
            try:    
                f.writerow([stock_name, all_data[2].getText(),all_data[17].getText(),all_data[13].getText(), all_data[29].getText(),all_data[26].getText()]) #write down PE data
            except (IndexError, HTTPError) as e:
                pass
            i+=1    

Thanks in advance for your help.

【Comments】:

  • Is this your actual indentation? Hint: any code that appears after a continue statement, at the same indentation level, will never execute.
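
The hint can be reproduced without any network access. The following is a minimal sketch, not the asker's scraper: `run`, the sample symbols, and the use of `ValueError` as a stand-in for `urllib2.HTTPError` are all made up for the demonstration. It uses a `for` loop so that it terminates; in the question's `while` loop the unreachable code includes `i+=1`, so the loop never advances.

```python
def run(symbols, failing):
    """Mimic the shape of the question's loop; `failing` is a made-up set of symbols that raise."""
    processed = []
    for symbol in symbols:
        try:
            if symbol in failing:
                raise ValueError(symbol)  # stand-in for urllib2.HTTPError
        except ValueError:
            continue
            processed.append(symbol)  # unreachable: it sits after `continue` inside the except block
    return processed


print(run(["AAPL", "MSFT"], failing={"MSFT"}))  # prints []
```

Nothing is ever appended: when the fetch succeeds the except block is skipped entirely, and when it fails `continue` jumps past the rest of the block.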

Tags: python exception nested-loops


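【Solution 1】:

Try moving i = i + 1 out one indentation level, so that it sits outside the inner for loop. I also dedented the code after `continue` to the same level as try and except; otherwise it would belong to the except block and could only run when an exception occurs (and, because of the continue, it would in fact never run at all).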

while i<len(newsymbolslist):
    try:
        page = urllib2.urlopen("http://finance.yahoo.com/q/ks?s="+newsymbolslist[i] +"%20Key%20Statistics").read()
    except urllib2.HTTPError:
        i += 1  # move on to the next symbol; without this a failing symbol is retried forever
        continue
    soup = BeautifulSoup(page)
    name_company = soup.findAll("div", {"class" : "title"}) 
    for name in name_company: #add multiple iterations?        
        all_data = soup.findAll('td', "yfnc_tabledata1")
        stock_name = name.find('h2').string #find company's name in name_company with h2 tag
        try:    
            f.writerow([stock_name, all_data[2].getText(),all_data[17].getText(),all_data[13].getText(), all_data[29].getText(),all_data[26].getText()]) #write down PE data
        except (IndexError, HTTPError) as e:
            pass
    i+=1    
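
One way to avoid managing the counter by hand is to iterate over the symbols directly. The sketch below is written for Python 3 and assumes a hypothetical `fetch_page` helper and `FetchError` exception in place of `urllib2.urlopen` and `urllib2.HTTPError`, so that it runs without network access; it is an illustration of the loop structure, not a replacement for the scraper.

```python
class FetchError(Exception):
    """Stand-in for urllib2.HTTPError in this sketch."""


def fetch_page(symbol):
    """Hypothetical fetcher: fails for the made-up symbol 'BAD'."""
    if symbol == "BAD":
        raise FetchError(symbol)
    return "<page for %s>" % symbol


def scrape(symbols):
    rows = []
    for symbol in symbols:           # the for loop advances on every pass
        if not symbol.strip():       # skip blank entries left over from split("\n")
            continue
        try:
            page = fetch_page(symbol)
        except FetchError:
            continue                 # safe here: the next pass uses the next symbol
        rows.append([symbol, page])  # runs only when the fetch succeeded
    return rows


print(scrape(["AAPL", "BAD", "", "MSFT"]))
# prints [['AAPL', '<page for AAPL>'], ['MSFT', '<page for MSFT>']]
```

Because the loop variable is supplied by `for`, a `continue` in the except block cannot stall the loop on the same symbol.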

【Comments】:

    【Solution 2】:

    I think the indentation is wrong. Try this:

    import urllib2
    from BeautifulSoup import BeautifulSoup
    import csv
    import re
    import urllib
    from urllib2 import HTTPError
    # import modules
    
    symbolfile = open("symbols.txt")
    symbolslist = symbolfile.read()
    newsymbolslist = symbolslist.split("\n")
    
    i = 0
    
    f = csv.writer(open("pe_ratio.csv","wb"))
    # short cut to write
    
    f.writerow(["Name","PE","Revenue % Quarterly","ROA% YOY","Operating Cashflow","Debt to Equity"])
    #first write row statement
    
    # define name_company as the following
    while i<len(newsymbolslist):
        try:
            page = urllib2.urlopen("http://finance.yahoo.com/q/ks?s="+newsymbolslist[i] +"%20Key%20Statistics").read()
        except urllib2.HTTPError:
            i += 1  # move on to the next symbol; without this a failing symbol is retried forever
            continue
        soup = BeautifulSoup(page)
        name_company = soup.findAll("div", {"class" : "title"}) 
        for name in name_company: #add multiple iterations?        
            all_data = soup.findAll('td', "yfnc_tabledata1")
            stock_name = name.find('h2').string #find company's name in name_company with h2 tag
            try:    
                f.writerow([stock_name, all_data[2].getText(),all_data[17].getText(),all_data[13].getText(), all_data[29].getText(),all_data[26].getText()]) #write down PE data
            except (IndexError, HTTPError) as e:
                pass
        i+=1
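
Note that the inner `except ... pass` silently drops the whole row whenever the page has fewer table cells than the hard-coded indexes expect. If a partially filled row is preferable, a small helper can substitute a default for missing cells. This is a sketch of one possible approach; `pick` and the sample values are hypothetical and not part of the original code.

```python
def pick(cells, indexes, default=""):
    """Return cells[i] for each non-negative i in indexes, or `default` where the list is too short."""
    return [cells[i] if i < len(cells) else default for i in indexes]


cells = ["12.3", "4.5", "6.7"]
print(pick(cells, [2, 17, 0]))  # prints ['6.7', '', '12.3']
```

In the scraper, the list passed to `f.writerow` could be built this way from the text of each cell, which would leave blank columns instead of skipping the company.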
    

    【Comments】:
