【发布时间】:2014-12-03 04:13:24
【问题描述】:
我正在尝试在此页面上进行无限滚动,这是我的代码:
from selenium import webdriver
import time
profile = webdriver.FirefoxProfile()
profile.set_preference("general.useragent.override","Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0")
driver = webdriver.Firefox(profile)
driver.get("http://www.quora.com/Programming-Languages/followers")
for n in range(0,5): # For testing I have capped this at 5, will handle this properly once things start to work.
driver.execute_script("window.scrollTo(0,1000000);")
time.sleep(2)
因此,当我运行此程序时,它会在进行任何滚动之前等待很多秒(有时超过 1 分钟),然后在下一次滚动之前再次等待相同的时间。 该代码似乎在其他页面上运行良好。 有关如何解决此问题的任何想法?
当我尝试使用 Chrome 而不是 firefox 时,我收到以下错误:
driver = webdriver.Chrome('/home/asdf/apps/chromedrive/chromedriver') 添加到 .py 文件中。
Traceback (most recent call last):
File "ok.py", line 8, in <module>
driver = webdriver.Chrome('/home/asdf/apps/chromedrive/chromedriver')
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/chrome/webdriver.py", line 65, in __init__
keep_alive=True)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 73, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 121, in start_session
'desiredCapabilities': desired_capabilities,
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 171, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/remote_connection.py", line 349, in execute
return self._request(command_info[0], url, body=data)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/remote_connection.py", line 379, in _request
self._conn.request(method, parsed_url.path, body, headers)
File "/usr/lib/python2.7/httplib.py", line 973, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.7/httplib.py", line 1007, in _send_request
self.endheaders(body)
File "/usr/lib/python2.7/httplib.py", line 969, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 829, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 791, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 772, in connect
self.timeout, self.source_address)
File "/usr/lib/python2.7/socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno -2] Name or service not known
【问题讨论】:
标签: python firefox selenium selenium-webdriver web-scraping