Site entry point: http://wise.xmu.edu.cn/people/faculty
Information to scrape: names and homepage URLs
Python version: 3.5
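import requests
from bs4 import BeautifulSoup

# Fetch the faculty listing page.
r = requests.get('http://www.wise.xmu.edu.cn/people/faculty')
html = r.content

# Parse as HTML; the 'xml' parser targets XML documents and can mis-handle HTML pages.
soup = BeautifulSoup(html, 'html.parser')

# The faculty links sit inside <div class="people_list">.
div_people_list = soup.find('div', attrs={'class': 'people_list'})
a_s = div_people_list.find_all('a', attrs={'target': '_blank'})

# Print each person's name and (relative) homepage URL.
for a in a_s:
    url = a['href']
    name = a.get_text()
    print(name, url)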
Output:
敖萌幪 /people/faculty/494d4f1c-0470-4f53-8b7c-d3594241876b.html
Bowers, Roslyn /people/faculty/d01fe119-7980-4238-a3ec-abb9b66ec706.html
Brown, Katherine /people/faculty/36c6b263-2cc2-4682-9975-02b75e6505f7.html
鲍小佳 /people/faculty/bdc3fd77-84de-4020-846d-344e02f110e9.html
Chang, Seong Yeon /people/faculty/0534965d-6393-4e22-a6bb-6ac3b11fe431.html
蔡熙乾 /people/faculty/95d97944-beb6-4a47-af85-a0778e1788b2.html
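
The homepage addresses printed above are site-relative paths, not full URLs. A minimal sketch of turning them into absolute URLs with the standard library's urljoin follows. It assumes the links resolve against the same page that was requested; the names BASE_URL and to_absolute are hypothetical helpers introduced only for illustration.

```python
from urllib.parse import urljoin

# Assumption: relative hrefs resolve against the page that was fetched.
BASE_URL = 'http://www.wise.xmu.edu.cn/people/faculty'


def to_absolute(href, base=BASE_URL):
    """Resolve a (possibly relative) href against the base page URL."""
    return urljoin(base, href)


# One of the relative paths from the sample output.
sample = '/people/faculty/494d4f1c-0470-4f53-8b7c-d3594241876b.html'
print(to_absolute(sample))
```

Inside the scraping loop, the same call could be applied as url = to_absolute(a['href']). An href that is already absolute passes through urljoin unchanged.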