【Posted】: 2017-09-28 11:18:00
【Question】:
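I want to pull one proxy at a time from a list that is refreshed every so often, and that part works fine.
Some of the proxies are bad, and when one fails I want to move on to the next one in the list. That is where my generator comes in. However, although I can get the generator going with a first call to .next(), calling it a second time gives me the same value!
Clearly I'm missing a key part of how generators work.
My generator code lives in the ProxyHandler class: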
class ProxyHandler:
    def __init__(self):
        self.proxies = list()
        self.current = dict()

    def get_proxies(self):
        """ Retrieves proxies """

    def __len__(self):
        return len(self.proxies)

    def yield_proxy(self):
        if not self.proxies:
            print 'Created new proxy list'
            self.get_proxies()  # This populates self.proxies which is a list of tuples where the 0th element is the host and the 1st element is the port
        for p in self.proxies:
            try:
                proxy = {'http': 'http://%s:%s' % (p[0], p[1])}  # Formatted to python's request lib proxy format
                self.current = proxy
                yield proxy
            except StopIteration:
                print 'Reached end of proxy list'
                self.current = {}
                self.get_proxies()
                yield self.yield_proxy()
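One detail that may be worth checking in yield_proxy: when the list simply runs out, the for loop ends on its own, so the except StopIteration branch does not appear to be reached by plain iteration. The sketch below mirrors the shape of the loop with a hypothetical walk() function and an events list (neither is part of the original code) to record which path actually runs.

```python
events = []  # hypothetical log, used only for this illustration


def walk(items):
    """Simplified stand-in with the same try/except shape as yield_proxy."""
    for item in items:
        try:
            yield item
        except StopIteration:
            # Not reached when the list is merely exhausted: the for loop
            # finishes by itself and the generator function returns.
            events.append('except branch')
    events.append('loop finished')


consumed = list(walk(['a', 'b']))
print(consumed)
print(events)
```

If the goal is to refill the list and start over once it is exhausted, that logic would probably need to sit after the loop rather than in an except clause.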
And the usage:
def get_response(self, url):
    proxy = self.proxy_handler.current
    if proxy == {}:
        proxy = self.proxy_handler.yield_proxy().next()
    print 'Current proxy -', proxy
    response = url_request(url, proxy=proxy)  # url_request() is basically a modified version of python's requests
    print response
    if response:  # url_request() returns true if status code == 200
        return response, proxy
    gen = self.proxy_handler.yield_proxy()
    gen.next()
    return self.get_ebay_response(url)
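The symptom can be reduced to a minimal sketch. It assumes a standalone yield_proxy() function in place of the method, and the host/port pairs are made-up placeholders. Every call to a generator function builds a fresh generator object that starts from the top, which would explain why calling yield_proxy() again followed by .next() keeps returning the first proxy. The built-in next(gen) is used here; in Python 2.7 it is equivalent to gen.next().

```python
def yield_proxy(proxies):
    """Simplified stand-in for ProxyHandler.yield_proxy (no refresh logic)."""
    for host, port in proxies:
        yield {'http': 'http://%s:%s' % (host, port)}


# Made-up placeholder proxies, for illustration only.
proxies = [('10.0.0.1', 8080), ('10.0.0.2', 3128)]

# Each call creates a new generator, so next() returns the first item again.
first = next(yield_proxy(proxies))
again = next(yield_proxy(proxies))

# Keeping a single generator object lets next() advance through the list.
gen = yield_proxy(proxies)
one = next(gen)
two = next(gen)

print(first == again)
print(one != two)
```

Under that reading, one possible direction is to create the generator once (for example in __init__) and store it on the handler, so that get_response advances the stored object instead of calling yield_proxy() again.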
【Discussion】:
Tags: python python-2.7 recursion python-requests generator