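【Posted】: 2020-12-14 16:37:54
【Question】:
I am trying to scrape a dynamic page with Selenium, BeautifulSoup, and Python, and I can scrape the first page. But when I try to go to the next page the URL does not change, and when I inspect the page I cannot see the form data either. Can someone help me?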
import time
from selenium import webdriver
from parsel import Selector
from bs4 import BeautifulSoup
import random
import re
import csv
import requests
import pandas as pd
companies = []
overview = []
people = []
driver = webdriver.Chrome(executable_path=r'C:\Users\rahul\Downloads\chromedriver_win32 (1)\chromedriver.exe')
driver.get('https://coverager.com/data/companies/')
driver.maximize_window()
src = driver.page_source
soup = BeautifulSoup(src, 'lxml')
table = soup.find('tbody')
descrip = []
table_rows = table.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    #print(td)
    row = [i.text.strip() for i in td]
    descrip.append(row)
    #print(row)
#file = open('gag.csv','w')
#with file:
# write = csv.writer(file)
# write.writerows(descrip)
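url = 'https://coverager.com'
# Collect every company link found in the first page's table
a_tags = table.find_all('a', href=True)
for link in a_tags:
    ol = link.get('href')
    # get_text() also works when the <a> contains nested tags (link.string would be None)
    pl = link.get_text(strip=True)
    #companies.append(row)
    #print(pl)
    #print(ol)
    # Open the company's detail page
    driver.get(url + ol)
    # implicitly_wait() takes seconds and stays in effect for the whole session
    driver.implicitly_wait(1000)
    data1 = driver.find_element_by_class_name('tab-details').text
    overview.append(data1.strip())
    # click() returns None, so its result is not stored
    driver.find_element_by_link_text('People').click()
    p_tags = driver.find_element_by_class_name('tab-details').text
    people.append(p_tags)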
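For reference, when the URL stays the same between pages, one common approach is to click the page's own "next" control and re-read `driver.page_source` after each click. The following is a minimal sketch of that idea, not a tested solution for this site. The names `iter_pages`, `selenium_callbacks`, `next_selector`, and `pause` are all hypothetical, and the CSS selector for the "next" control is an assumption that has to be looked up in the browser's inspector.

```python
import time
from typing import Callable, Iterator, List

Rows = List[List[str]]


def iter_pages(read_rows: Callable[[], Rows],
               click_next: Callable[[], bool],
               max_pages: int = 100) -> Iterator[Rows]:
    """Yield the rows of each page, advancing until click_next() returns False."""
    for _ in range(max_pages):  # hard cap so a stuck pager cannot loop forever
        yield read_rows()
        if not click_next():
            break


def selenium_callbacks(driver, next_selector: str, pause: float = 2.0):
    """Build read_rows/click_next for a live Selenium driver.

    next_selector is a hypothetical CSS selector for the "next page" control.
    """
    # Imported lazily so iter_pages stays usable without a browser installed.
    from bs4 import BeautifulSoup
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    def read_rows() -> Rows:
        # page_source reflects the current DOM, so it changes after each click
        soup = BeautifulSoup(driver.page_source, "lxml")
        body = soup.find("tbody")
        if body is None:
            return []
        return [[td.text.strip() for td in tr.find_all("td")]
                for tr in body.find_all("tr")]

    def click_next() -> bool:
        try:
            button = driver.find_element(By.CSS_SELECTOR, next_selector)
        except NoSuchElementException:
            return False  # no "next" control found: treat as the last page
        if not button.is_enabled():
            return False
        button.click()
        time.sleep(pause)  # crude fixed wait for the table to re-render
        return True

    return read_rows, click_next
```

With the `driver` from the code above, this might be wired up as `read_rows, click_next = selenium_callbacks(driver, 'a.next')` and then `for rows in iter_pages(read_rows, click_next): descrip.extend(rows)`, where `'a.next'` is only a placeholder selector. Some sites mark the last page's "next" control with a CSS class instead of disabling it, in which case `is_enabled()` will not detect the end and the `max_pages` cap is what stops the loop.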
【Discussion】:
Tags: python selenium web web-scraping beautifulsoup