【Question title】: Scraping Wikipedia
【Posted】: 2019-05-24 10:52:27
【Question description】:
【Discussion】:
Tags:
python
web-scraping
wikipedia
【Solution 1】:
You can try pandas, like this:
>>> import pandas as pd
>>> table = pd.read_html('https://en.wikipedia.org/wiki/List_of_chemical_elements')
>>> table[1]  # read_html returns a list of DataFrames; this picks the second one
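Picking a table by position (`table[1]`) can break if the page layout changes. As a minimal sketch, `pd.read_html` also accepts a `match` argument that keeps only tables whose text matches a string or regex. The inline HTML below is a made-up stand-in for the Wikipedia page so the example runs offline, and the column names are assumptions rather than the article's real headers. It also assumes an HTML parser that pandas supports (such as lxml) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical miniature page with two tables, standing in for the real article.
HTML = """
<table>
  <tr><th>Legend</th></tr>
  <tr><td>ignore me</td></tr>
</table>
<table>
  <tr><th>Atomic number</th><th>Symbol</th><th>Name</th></tr>
  <tr><td>1</td><td>H</td><td>Hydrogen</td></tr>
  <tr><td>2</td><td>He</td><td>Helium</td></tr>
</table>
"""

# match= keeps only the tables that contain the given text,
# so the result does not depend on the table's position in the page.
tables = pd.read_html(StringIO(HTML), match="Atomic number")
elements = tables[0]
print(elements)
```

With the real page, the same call would take the URL in place of `StringIO(HTML)`, and the `match` text would need to be something that actually appears in the target table.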
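【Solution 2】:
I've found the answer to my first question. Thanks, everyone.
import pandas as pd
import requests
from bs4 import BeautifulSoup as bs

# Download the page HTML (the variable holds the HTML text, not a URL)
summary_url = requests.get('https://en.wikipedia.org/wiki/List_of_chemical_elements').text
summary_soup = bs(summary_url, 'html.parser')
summary_table = summary_soup.find('table', {'class': 'wikitable sortable collapsible'})

array = []
rows = summary_table.find_all('tr')
# The second row holds the column headers
header = [col.text for col in rows[1].find_all('th')]
# Skip the two leading header rows and the last row
for row in rows[2:-1]:
    tmp_row = []
    for column in row.find_all('td'):
        tmp_row.append(column.text)
    array.append(tmp_row)
df_raw = pd.DataFrame(array, columns=header)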
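The row-and-cell loop above can be tried offline on a small snippet. This is only a sketch: the HTML is invented for illustration, and it has a single header row, unlike the real page. It also shows one possible variation: searching with `class_="wikitable"` matches any table that carries that class, whereas a multi-class string such as 'wikitable sortable collapsible' matches only when the class attribute is exactly that string.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature table, standing in for the Wikipedia one.
HTML = """
<table class="wikitable sortable">
  <tr><th>Atomic number</th><th>Symbol</th><th>Name</th></tr>
  <tr><td>1</td><td>H</td><td>Hydrogen</td></tr>
  <tr><td>2</td><td>He</td><td>Helium</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# A single class name matches regardless of the table's other classes.
table = soup.find("table", class_="wikitable")
rows = table.find_all("tr")

# First row: header cells; remaining rows: data cells.
header = [th.get_text(strip=True) for th in rows[0].find_all("th")]
records = [
    [td.get_text(strip=True) for td in row.find_all("td")]
    for row in rows[1:]
]

print(header)
print(records)
```

`get_text(strip=True)` trims the trailing newlines that `.text` would otherwise leave in each cell.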