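【Posted】: 2016-09-02 15:10:25
【Question】:
I wrote a web scraper in Python. At the end I want to print ("Bakerloo:" + info_from_website), as you can see in the code, but the output always shows only info_from_website and the "Bakerloo:" string is ignored. I have not been able to find a fix.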
import urllib
import urllib.request
from bs4 import BeautifulSoup
import sys

url = 'https://tfl.gov.uk/tube-dlr-overground/status/'
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, "html.parser")

# The class string of the <li> differs depending on whether the line is disrupted,
# so try the normal variant first and fall back to the disrupted one.
try:
    bakerlooInfo = (soup.find('li', {"class": "rainbow-list-item bakerloo "}).find_all('span')[2].text)
except:
    bakerlooInfo = (soup.find('li', {"class": "rainbow-list-item bakerloo disrupted expandable "}).find_all('span')[2].text)

# Only '\n' is removed here; any other whitespace in the text is kept.
bakerloo = bakerlooInfo.replace('\n', '')
print("Bakerloo : " + bakerloo)
【Discussion】:
Tags: python python-3.x web-scraping
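One possible explanation, which the post does not confirm: the scraped text may contain a carriage return (`\r`) in addition to `\n`. Since the code only removes `\n`, a leftover `\r` would send the terminal cursor back to the start of the line, and the text that follows would overwrite "Bakerloo : ". The sketch below shows a way to check for this with `repr()` and to normalize all whitespace. The `clean_status` helper and the `sample` string are hypothetical and stand in for whatever `.text` actually returns from the page.

```python
def clean_status(raw: str) -> str:
    """Collapse every run of whitespace (carriage returns, newlines, spaces) into one space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces.
    return " ".join(raw.split())


# Hypothetical sample; the real value comes from the scraped page and may differ.
sample = "\r\n        Good service\r\n    "

# repr() makes invisible characters such as \r visible for debugging.
print(repr(sample))

print("Bakerloo : " + clean_status(sample))
```

If `repr(bakerlooInfo)` shows no `\r` in the real data, this is not the cause and the problem lies elsewhere.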