【Posted】: 2021-08-11 12:07:16
【Question】:
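I'm trying to get a Python scraper working but can't manage it, and a little help would be useful, since I'm still a beginner. The code starts fine, but then it crashes and exports only a single job to my CSV. I think the crash is random, and it gives no error. Could someone with more experience give me some pointers? Thanks in advance.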
from selenium import webdriver
import pandas as pd
from bs4 import BeautifulSoup

options = webdriver.FirefoxOptions()
driver = webdriver.Firefox()
driver.maximize_window()

df = pd.DataFrame(columns=["Title","Location","Company","Salary","Sponsored","Description"])

for i in range(25):
    driver.get('https://www.indeed.co.in/jobs?q=artificial%20intelligence&l=India&start='+str(i))
    jobs = []
    driver.implicitly_wait(20)

    # Everything below is nested inside the page loop so each page is processed
    for job in driver.find_elements_by_class_name('result'):
        soup = BeautifulSoup(job.get_attribute('innerHTML'),'html.parser')

        # Each field falls back to 'None' when the element is missing
        try:
            title = soup.find("a",class_="jobtitle").text.replace("\n","").strip()
        except:
            title = 'None'
        try:
            location = soup.find(class_="location").text
        except:
            location = 'None'
        try:
            company = soup.find(class_="company").text.replace("\n","").strip()
        except:
            company = 'None'
        try:
            salary = soup.find(class_="salary").text.replace("\n","").strip()
        except:
            salary = 'None'
        try:
            sponsored = soup.find(class_="sponsoredGray").text
            sponsored = "Sponsored"
        except:
            sponsored = "Organic"

        # Click the summary to open the description pane; close a popover if it blocks the click
        sum_div = job.find_element_by_class_name('summary')
        try:
            sum_div.click()
        except:
            close_button = driver.find_elements_by_class_name('popover-x-button-close')[0]
            close_button.click()
            sum_div.click()
        driver.implicitly_wait(2)
        try:
            job_desc = driver.find_element_by_css_selector('div#vjs-desc').text
            print(job_desc)
        except:
            job_desc = 'None'

        # Add one row per job; this stays inside the inner loop
        df = df.append({'Title':title,'Location':location,"Company":company,"Salary":salary,
                        "Sponsored":sponsored,"Description":job_desc},ignore_index=True)

# Write the CSV once, after all pages have been processed
df.to_csv(r"C:\Users\Desktop\Python\Newtest.csv",index=False)
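The script above grows the DataFrame one row at a time with `df.append`, which was deprecated in pandas 1.4 and removed in pandas 2.0. One possible alternative is to collect each job as a dict in a plain list and write the file once at the end. The following is only a minimal sketch using the standard-library `csv` module; the helper name `rows_to_csv`, the `FIELDS` constant, and the made-up job values are assumptions for illustration, not part of the original script.

```python
import csv
import io

# Column names taken from the DataFrame in the question
FIELDS = ["Title", "Location", "Company", "Salary", "Sponsored", "Description"]

def rows_to_csv(rows, out):
    """Write a list of dicts to `out`; missing fields are filled with 'None'."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, restval="None")
    writer.writeheader()
    writer.writerows(rows)

# Made-up values standing in for scraped pages (hypothetical data)
rows = []
for page in range(2):
    for n in range(3):
        # One dict per job, appended inside the inner loop
        rows.append({"Title": f"job {page}-{n}", "Sponsored": "Organic"})

# An in-memory buffer is used here; a real run would pass an open file instead
buffer = io.StringIO()
rows_to_csv(rows, buffer)
print(buffer.getvalue())
```

With pandas, the same list of dicts could instead be passed to `pd.DataFrame(rows)` after the loops finish, which avoids rebuilding the frame on every iteration.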
【Comments】:
- This looks like an indentation problem. The code in my answer gave me a CSV file with 1931 rows.
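To illustrate how indentation alone can change the number of rows collected, here is a small sketch with made-up data (the `pages` list and both result lists are hypothetical and do not come from the question or the answer). The only difference between the two loops is whether the `append` call sits inside the inner loop.

```python
# Made-up data: two "pages", each holding a few job titles
pages = [["a", "b", "c"], ["d", "e"]]

# append dedented out of the inner loop: runs once per page,
# so only the last job of each page is kept
outside = []
for page in pages:
    for job in page:
        title = job.upper()
    outside.append(title)

# append indented inside the inner loop: runs once per job
inside = []
for page in pages:
    for job in page:
        title = job.upper()
        inside.append(title)

print(outside)  # ['C', 'E']
print(inside)   # ['A', 'B', 'C', 'D', 'E']
```

If the row-building statement ends up outside both loops, it runs only once, which would match the symptom of a single job in the CSV.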
Tags: python selenium beautifulsoup webdriver selenium-firefoxdriver