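【Posted】: 2020-12-29 04:53:29
【Question】:
I have scraped data from a website. I need to write it to a CSV file, then read that file back and display its contents.
Please do not use pandas (that is, do not build a DataFrame and then export it to CSV).
I am looking for a way to write the scraped data directly to a CSV file, and then read the data from that CSV file and display it in Python IDLE.
Here is the code: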
import requests
from bs4 import BeautifulSoup

start_url = "https://www.indeed.co.in/jobs?q=teacher&l=India"
base_url = "https://www.indeed.com"  # prefix for the relative job links
page_data = requests.get(start_url)  # send an HTTP request to the site
soup = BeautifulSoup(page_data.content, "html.parser")  # parse the returned HTML

# Lists that the scraped fields are appended to
Title = []
Company = []
Summary = []
Location = []
Link_to_apply = []

for job_tag in soup.find_all("div", class_="jobsearch-SerpJobCard unifiedRow row result"):
    title = job_tag.find("h2", class_="title")
    company = job_tag.find("span", class_="company")
    location = job_tag.find(class_="location accessible-contrast-color-location").text.strip()
    summary = job_tag.find("div", class_="summary")
    link = job_tag.find("a", href=True)
    final_link = base_url + link["href"]
    # .text gives the tag's text content, replace() turns each newline into
    # a single space, and strip() removes leading and trailing whitespace
    Title.append(title.text.replace("\n", " ").strip())
    Company.append(company.text.replace("\n", " ").strip())
    Summary.append(summary.text.replace("\n", " ").strip())
    Location.append(location.replace("\n", " "))
    Link_to_apply.append(final_link)
Please note that only Python IDLE can be used.
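One possible approach that avoids pandas is the standard-library `csv` module. The sketch below is only an illustration, not a tested solution for this particular site: the helper names `write_jobs_csv` and `read_jobs_csv` and the file name `jobs.csv` are hypothetical, and it assumes the five lists from the question have the same length.

```python
import csv

# Column names mirror the list names used in the question.
HEADER = ["Title", "Company", "Summary", "Location", "Link_to_apply"]


def write_jobs_csv(path, titles, companies, summaries, locations, links):
    """Write the parallel lists to a CSV file, one job per row (hypothetical helper)."""
    # newline="" lets the csv module manage line endings itself,
    # which avoids blank lines between rows on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        # zip() pairs up the i-th element of every list into one row.
        writer.writerows(zip(titles, companies, summaries, locations, links))


def read_jobs_csv(path):
    """Read the CSV file back and return its rows as lists of strings (hypothetical helper)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))
```

In IDLE, once the scraping loop has finished, calling `write_jobs_csv("jobs.csv", Title, Company, Summary, Location, Link_to_apply)` and then running `for row in read_jobs_csv("jobs.csv"): print(row)` should display the header followed by one line per job. The `csv` module quotes fields that contain commas, so summaries with commas in them should survive the round trip.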
【Discussion】:
Tags: python-3.x list csv web-scraping beautifulsoup