【Posted】: 2019-09-25 16:32:03
【Question】:
I am scraping with the Luminati proxy service and Selenium. I configure the proxy like this:
# Imports assumed by this snippet.
import os

from selenium import webdriver
from selenium.webdriver.firefox.options import Options as FirefoxOptions

# `app` and LOCAL_HOST are defined elsewhere in the project.


def get_driver(proxy_server):
    executable_path = os.path.join(app.config['JOBS_ROOT'], app.config['BROWSER_DRIVER'])
    firefox_profile = webdriver.FirefoxProfile()
    firefox_options = FirefoxOptions()
    firefox_options.add_argument('-headless')
    if proxy_server and proxy_server != LOCAL_HOST:
        # Route both HTTP and HTTPS traffic through the Luminati super proxy.
        firefox_profile.set_preference('network.proxy.type', 1)
        firefox_profile.set_preference('network.proxy.http', 'zproxy.lum-superproxy.io')
        firefox_profile.set_preference('network.proxy.http_port', 22225)
        firefox_profile.set_preference('network.proxy.ssl', 'zproxy.lum-superproxy.io')
        firefox_profile.set_preference('network.proxy.ssl_port', 22225)
    driver = webdriver.Firefox(
        executable_path=executable_path,
        firefox_options=firefox_options,
        firefox_profile=firefox_profile
    )
    return driver
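For reference, here is a minimal sketch of the same proxy settings collected in one dictionary and applied in a loop. The helper names `luminati_proxy_preferences` and `apply_preferences` are hypothetical (they are not part of Selenium); the only thing assumed of the profile object is a `set_preference(name, value)` method, which `webdriver.FirefoxProfile` provides.

```python
def luminati_proxy_preferences(host="zproxy.lum-superproxy.io", port=22225):
    """Return Firefox preferences that send HTTP and HTTPS through one proxy.

    Hypothetical helper: the host and port defaults are the values used in
    the question's configuration.
    """
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": port,
    }


def apply_preferences(profile, preferences):
    """Copy each preference onto an object exposing set_preference(name, value)."""
    for name, value in preferences.items():
        profile.set_preference(name, value)
    return profile
```

With this, the body of the `if` branch would reduce to `apply_preferences(firefox_profile, luminati_proxy_preferences())`. It changes nothing about the authentication prompt; it only keeps the host and port in one place.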
Using Luminati with this configuration requires authentication.
My first attempt at checking that the proxy IP works used https://whatsmyip.org:
driver.get('https://whatsmyip.org')
alert = WebDriverWait(driver, 5).until(expected_conditions.alert_is_present())
alert.send_keys(app.config['LUMINATI_CUSTOMER'] + Keys.TAB + app.config['LUMINATI_PASSWORD'])
alert.accept()
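However, after alert.accept() the driver closes without even showing the page content (it also seems very unstable: sometimes it works, sometimes it does not).
So I ended up repeating the driver.get() call: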
driver.get('https://whatsmyip.org')
alert = WebDriverWait(driver, 5).until(expected_conditions.alert_is_present())
alert.send_keys(app.config['LUMINATI_CUSTOMER'] + Keys.TAB + app.config['LUMINATI_PASSWORD'])
alert.accept()
driver.get('https://whatsmyip.org')
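The same workaround can be wrapped in one function so it is not repeated at every call site. This is only a sketch that restates the steps above; it does not explain why the second `driver.get()` is needed. The function name `authenticate_and_reload` and the `wait_for_alert` callable are hypothetical, and `"\ue004"` is the value of Selenium's `Keys.TAB`.

```python
def authenticate_and_reload(driver, url, username, password, wait_for_alert,
                            tab_key="\ue004"):
    """Load url, answer the proxy credentials prompt, then load url again.

    Hypothetical helper. `wait_for_alert` is any callable that takes the
    driver and returns the alert object once the prompt is present.
    """
    driver.get(url)
    alert = wait_for_alert(driver)
    # Type "username<TAB>password" so both fields of the prompt are filled.
    alert.send_keys(username + tab_key + password)
    alert.accept()
    # Repeat the request, as in the workaround above.
    driver.get(url)
```

It could be called with `wait_for_alert=lambda d: WebDriverWait(d, 5).until(expected_conditions.alert_is_present())` and the credentials from `app.config`.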
But I don't think it is supposed to work this way.
More importantly, most of the time I get the following error:
Traceback (most recent call last):
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/worker.py", line 812, in perform_job
    rv = job.perform()
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/job.py", line 588, in perform
    self._result = self._execute()
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/job.py", line 594, in _execute
    return self.func(*self.args, **self.kwargs)
  File "./jobs/scrapping.py", line 185, in scrap_plate_number
    record = scrap_and_recognize(driver, vehicle)
  File "./jobs/scrapping.py", line 91, in scrap_and_recognize
    driver.find_element_by_xpath('//div[contains(@class, "jcrm-botondetalle")]/a').click()
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 978, in find_element
    'value': value})['value']
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 241, in check_response
    raise exception_class(message, screen, stacktrace, alert_text)
selenium.common.exceptions.UnexpectedAlertPresentException: Alert Text: None
Message: Dismissed user prompt dialog: The proxy moz-proxy://zproxy.lum-superproxy.io:22225 is requesting a username and password. The site says: “Luminati”
I don't know why Firefox dismisses the alert. Any clues?
【Discussion】:
Tags: python-3.x selenium selenium-firefoxdriver