Using BeautifulSoup

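To further optimize my earlier script for fetching grades from the Academic Affairs Office site, I used the BeautifulSoup module to filter the strings it extracts.

First, create the BeautifulSoup object (this needs `import requests` and `from bs4 import BeautifulSoup`; headersgrade is the request-header dict from the earlier script):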

soup = BeautifulSoup(requests.get('http://run.hbut.edu.cn/StuGrade/Index',headers=headersgrade).text,"html.parser")

requests fetches the page, and BeautifulSoup parses the returned HTML text into an object that is stored in soup.
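To try the parsing step without network access, here is a minimal sketch that builds a soup object from an inline HTML string. SAMPLE_HTML is a made-up stand-in for the downloaded grade page, not its real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the downloaded grade page (assumption).
SAMPLE_HTML = """
<html><body>
<h2>2015-2016 Term 1</h2>
<a href="/StuGrade/Index">Grades</a>
<table><tr><td>1</td><td>Calculus</td></tr></table>
</body></html>
"""

# "html.parser" is the parser bundled with the Python standard library.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

heading = soup.h2.string   # text of the first <h2>
link = soup.a["href"]      # attribute access works like a dictionary lookup
print(heading)
print(link)
```

Passing a string instead of a live response also makes the later steps easy to test against a saved copy of the page.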

The soup object can be treated somewhat like a dictionary or an array, and it offers several ways to reach the elements inside it:

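print(soup.a)  # a can be replaced by any tag name that appears in the HTML

This returns the first a tag together with its content. To get only the text inside the a tag, use string:

print(soup.a.string)  # returns the text inside the a tag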
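One detail worth knowing: string only returns text when the tag has a single child, and gives None otherwise, whereas get_text() joins all nested text. A small sketch on a made-up snippet (the HTML is an assumption for illustration):

```python
from bs4 import BeautifulSoup

# Made-up snippet: one plain link and one link containing a nested tag.
html = '<p><a href="/a">plain link</a> <a href="/b">link with <b>bold</b> text</a></p>'
soup = BeautifulSoup(html, "html.parser")

first = soup.a                   # attribute access returns the first <a> tag
second = soup.find_all("a")[1]   # the second <a> tag

print(first)                 # the whole tag, markup included
print(first.string)          # plain link
print(second.string)         # None, because this tag has several children
print(second.get_text())     # link with bold text
```

This is probably why the loop over td cells in this post uses get_text() rather than string.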

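There is also a find_all method. It takes a tag name as its argument and returns every element with that tag. Here I create a new list to hold the cleaned-up strings taken from those results.

First, loop over the target elements in soup and append them to the newsoup list: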

newsoup = []
for i in soup.find_all("td"):
    newsoup.append(i.get_text().replace(' ','').replace('\n','').replace('\r',''))

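For each element, get_text() returns the text it contains, and the irrelevant characters (spaces, newlines, carriage returns) are removed.

The last step is just simple formatting and the grade point calculation. I'll post the code here without going through it line by line.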
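The flat list of cells can also be regrouped into rows before any further processing. The sketch below uses a made-up table with 3 cells per row. The real grade page appears to have 9 cells per row, judging from the index arithmetic used in this post, and chunk_rows is a hypothetical helper rather than part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Made-up table with 3 cells per row (assumption for illustration).
html = """
<table>
  <tr><td> 1 </td><td> Calculus
  </td><td> 4.0 </td></tr>
  <tr><td> 2 </td><td> Physics </td><td> 3.5 </td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")


def clean(text):
    # Same cleanup as in the post: drop spaces, newlines and carriage returns.
    return text.replace(" ", "").replace("\n", "").replace("\r", "")


def chunk_rows(cells, width):
    # Hypothetical helper: split the flat list into complete rows of `width`
    # cells, ignoring any incomplete trailing row.
    return [cells[i:i + width] for i in range(0, len(cells) - width + 1, width)]


cells = [clean(td.get_text()) for td in soup.find_all("td")]
rows = chunk_rows(cells, 3)
for row in rows:
    print(row)
```

With rows in hand, each field can be read as row[k] instead of computing 9*n+k by hand.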

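n = 0
xuefen = 0.00  # running total of credits
jx = 0.00      # running total of grade point * credits
print('           Design by MinYuandong\n'+'         '+ str(soup.h2.string))
print("——*——*——*——*——*——*——*——*——*——*——\n   Course         Grade point         Credits         Total score\n")
# Each table row has 9 cells; integer division gives the number of complete rows
while n < len(newsoup) // 9:
    # Cells 1, 3, 4 and 5 of a row: course name, grade point, credits, total score
    print(newsoup[9*n+1]+'       '+newsoup[9*n+3]+'       '+newsoup[9*n+4]+'       '+newsoup[9*n+5]+'\n')
    jx = jx + float(newsoup[9*n+3])*float(newsoup[9*n+4])
    xuefen = xuefen + float(newsoup[9*n+4])
    n = n + 1
print('Your average grade point this semester is: '+ str(jx/xuefen))
print('——*——*——*——*——*——*——*——*——*——*——')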
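The average computed above is a credit-weighted mean: the sum of grade point times credits, divided by the total credits. Here is a minimal standalone sketch of that calculation. weighted_gpa and the sample numbers are made up for illustration, and the guard for zero credits is an addition not present in the script.

```python
def weighted_gpa(courses):
    """Return the credit-weighted average grade point.

    courses: iterable of (grade_point, credits) pairs.
    Returns None when the total credits are zero, to avoid dividing by zero.
    """
    total_points = 0.0
    total_credits = 0.0
    for grade_point, credits in courses:
        total_points += grade_point * credits
        total_credits += credits
    if total_credits == 0:
        return None
    return total_points / total_credits


# Made-up sample data: (grade point, credits)
sample = [(4.0, 3.0), (3.0, 2.0), (2.0, 1.0)]
result = weighted_gpa(sample)
print(result)  # (12 + 6 + 2) / 6, roughly 3.33
```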

Here is the result:

[Image: screenshot of the script output]

Done!