【Question title】: Get tables with web-scraping in python
【Posted】: 2021-01-03 13:22:59
【Question】:
import requests
from bs4 import BeautifulSoup


url = 'https://www.universitego.com/bilgisayar-muhendisligi-2021-taban-puanlari-ve-basari-siralamalari/'
soup = BeautifulSoup(requests.get(url).content.decode('utf-8', 'ignore'), 'html.parser')

for span in soup.select('tr > td:nth-child(1)'):
    print(span.get_text(strip=True, separator=' '))
    print('-' * 80)

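I am using the code above to get the departments, and the tables about those departments, from the websites below. However, when I run it I get an empty list. What should I do? Thanks.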

Websites: https://www.universitego.com/4-yillik-bolumlerin-2015-2016-taban-puanlari-ve-basari-siralamalari/ https://www.universitego.com/bilgisayar-muhendisligi-2021-taban-puanlari-ve-basari-siralamalari/
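
When a selector returns nothing, one way to narrow the problem down is to check whether the HTML that was actually parsed contains any table markup at all. The following is a minimal sketch of that check, not a diagnosis of this particular site: the helper name `summarize_tables` and the sample markup are hypothetical, and the sample string stands in for the body of a live response.

```python
from bs4 import BeautifulSoup


def summarize_tables(html):
    """Return quick counts showing whether the parsed HTML contains table data."""
    soup = BeautifulSoup(html, 'html.parser')
    return {
        'tables': len(soup.find_all('table')),
        'rows': len(soup.find_all('tr')),
        # Same selector as in the question
        'first_cells': len(soup.select('tr > td:nth-child(1)')),
    }


# Hypothetical sample markup, used here instead of a live request
sample_html = """
<table>
  <tbody>
    <tr><td>Üniversite Adı</td><td>Bölüm</td></tr>
    <tr><td>BOĞAZİÇİ ÜNİVERSİTESİ</td><td>Bilgisayar Mühendisliği</td></tr>
  </tbody>
</table>
"""

summary = summarize_tables(sample_html)
print(summary)
```

With a live page, the same function could be called on `requests.get(url).text`. If every count is zero, the response probably does not contain the table markup, so the status code and the raw response body would be worth inspecting next.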

【Comments】:

  • I assume that by "tables" you mean the list containing all the Acil Yardım ve Afet Yönetimi 2021 Taban Puanları values?

Tags: python web-scraping beautifulsoup python-requests urllib


【Solution 1】:

Here you go:

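from bs4 import BeautifulSoup
from requests import get

url = 'https://www.universitego.com/bilgisayar-muhendisligi-2021-taban-puanlari-ve-basari-siralamalari/'
r = get(url)
soup = BeautifulSoup(r.content, features='lxml')  # requires the lxml package

# Rows of the first table on the page; the first row holds the column names
rows = soup.find('table').find('tbody').find_all('tr')
keys = [cell.get_text(strip=True) for cell in rows[0].find_all(['th', 'td'])]

# Build one dict per data row, reading cell by cell so no empty keys appear
resulting_list_of_dicts = []
for row in rows[1:]:
    values = [cell.get_text(strip=True) for cell in row.find_all('td')]
    resulting_list_of_dicts.append(dict(zip(keys, values)))

# Inspect the first record
resulting_list_of_dicts[0]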
{'Üniversite Adı': 'BOĞAZİÇİ ÜNİVERSİTESİ (İSTANBUL) (Devlet Üniversitesi)',
 'Bölüm': 'Bilgisayar Mühendisliği (İngilizce)',
 'Puan Türü': 'SAY',
 'Kont.': '85',
 'Taban Puanı': '546,34716',
 'Başarı Sırası': '643'}
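
Every value in the record above is a string, and the score uses a comma as the decimal mark. If numeric values are needed, a conversion step along these lines might help. This is only a sketch: the helper `parse_decimal_comma` is hypothetical, and it assumes that the comma is always the decimal separator and that the quota and rank cells hold plain integers, which has not been checked against every row of the table.

```python
def parse_decimal_comma(text):
    """Convert a string such as '546,34716' to a float; assumes ',' is the decimal mark."""
    return float(text.replace(',', '.'))


# First record, copied from the output shown in the answer
record = {
    'Üniversite Adı': 'BOĞAZİÇİ ÜNİVERSİTESİ (İSTANBUL) (Devlet Üniversitesi)',
    'Bölüm': 'Bilgisayar Mühendisliği (İngilizce)',
    'Puan Türü': 'SAY',
    'Kont.': '85',
    'Taban Puanı': '546,34716',
    'Başarı Sırası': '643',
}

# Copy the record and convert the three numeric fields
typed_record = dict(record)
typed_record['Kont.'] = int(record['Kont.'])
typed_record['Taban Puanı'] = parse_decimal_comma(record['Taban Puanı'])
typed_record['Başarı Sırası'] = int(record['Başarı Sırası'])

print(typed_record['Taban Puanı'])
```

Rows with missing or non-numeric cells would raise `ValueError` here, so a real run over the whole table would likely need to handle that case.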

【Comments】:
