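【Posted】: 2018-10-02 20:43:39
【Question】:
I am trying to get a text element from a page. To reach this element, my script clicks two filters on the page. I need to scrape 5,000 pages. The script does collect the text element, but after a certain number of pages it always fails with the message "element not visible". I assume this happens because the page did not load in time, since I checked the pages where it broke and the text element was there. (I have already added time.sleep(3) after every click.) If a page does not load in time, what can I use in my script to skip that page?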
import json
import os
import re
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def yelp_scraper(url):
    driver.get(url)
    # get total number of restaurants
    total_rest_loc = '//span[contains(text(),"Showing 1")]'
    total_rest_raw = driver.find_element_by_xpath(total_rest_loc).text
    total_rest = int(re.sub(r'Showing 1.*of\s', '', total_rest_raw))
    # open the filter panel
    button1 = driver.find_element_by_xpath('//span[@class="filter-label filters-toggle js-all-filters-toggle show-tooltip"]')
    button1.click()
    time.sleep(1)
    # apply the "Walking (1 mi.)" distance filter
    button2 = driver.find_element_by_xpath('//span[contains(text(),"Walking (1 mi.)")]')
    button2.click()
    time.sleep(2)
    # read the number of restaurants after filtering
    rest_num_loc = '//span[contains(text(),"Showing 1")]'
    rest_num_raw = driver.find_element_by_xpath(rest_num_loc).text
    rest_num = int(re.sub(r'Showing 1.*of\s', '', rest_num_raw))
    # if the filtered count equals the total, switch to "Biking (2 mi.)"
    # and back to "Walking (1 mi.)", then read the count again
    if total_rest == rest_num:
        button3 = driver.find_element_by_xpath('//span[contains(text(),"Biking (2 mi.)")]')
        button3.click()
        time.sleep(2)
        button4 = driver.find_element_by_xpath('//span[contains(text(),"Walking (1 mi.)")]')
        button4.click()
        time.sleep(2)
        rest_num_loc = '//span[contains(text(),"Showing 1")]'
        rest_num_raw = driver.find_element_by_xpath(rest_num_loc).text
        rest_num = int(re.sub(r'Showing 1.*of\s', '', rest_num_raw))
    return rest_num


chromedriver = "/Applications/chromedriver"  # path to the chromedriver executable
os.environ["webdriver.chrome.driver"] = chromedriver
chrome_options = Options()
# add headless mode
chrome_options.add_argument("--headless")
# turn off image loading
prefs = {"profile.managed_default_content_settings.images": 2}
chrome_options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(chromedriver, chrome_options=chrome_options)

# url_list and yelp_data are defined elsewhere (not shown in the question)
for url in url_list:
    yelp_data[url] = yelp_scraper(url)
with open('../data/yelp_json/yelp_data.json', 'w') as f:
    json.dump(yelp_data, f, indent="\t")
driver.close()
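One possible way to skip a page that fails is to wrap each call in a try/except and move on to the next URL. The following is only a minimal sketch, not a tested fix for this script: `scrape_all`, `scrape_one` and `skip_on` are hypothetical names introduced for illustration, and the helper is written without Selenium imports so it can be run on its own.

```python
import logging


def scrape_all(urls, scrape_one, skip_on=(Exception,)):
    """Call scrape_one(url) for every URL and skip the ones that raise.

    scrape_one and skip_on are hypothetical parameters for this sketch:
    scrape_one is any function taking a URL, skip_on is a tuple of
    exception types that should cause a page to be skipped.
    """
    results = {}   # url -> scraped value for pages that succeeded
    skipped = []   # urls that raised one of the skip_on exceptions
    for url in urls:
        try:
            results[url] = scrape_one(url)
        except skip_on as exc:
            # Log the failure and continue with the next URL.
            logging.warning("Skipping %s: %s", url, exc)
            skipped.append(url)
    return results, skipped
```

With the code from the question this could be called as `yelp_data, skipped = scrape_all(url_list, yelp_scraper, skip_on=(WebDriverException,))`, where `WebDriverException` is imported from `selenium.common.exceptions` and is the base class of Selenium's errors, including the "element not visible" one. The `skipped` list can then be retried later. Separately, replacing the fixed `time.sleep` calls with an explicit wait such as `WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, xpath)))` may reduce the failures in the first place; that call raises `TimeoutException` when the element does not become clickable in time, which the same `skip_on` tuple would catch.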
【Comments】:
-
Please post the code you have tried.
-
I have added it above.
Tags: python-3.x selenium error-handling web-scraping