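【Posted】: 2021-01-13 16:11:13
【Question】:
I'm building a web crawler for a recipe website. I want to collect the recipe links and then use each link to fetch that recipe's ingredients. I can do both steps, but only by manually typing a recipe link into the second function. Is there a way to grab the links and then use them to look up the ingredients? I'd also appreciate any suggestions on how to improve this code!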
import requests
from bs4 import BeautifulSoup

def trade_spider():
    # Print the link of every recipe on the listing page.
    url = 'https://tasty.co/topic/best-vegetarian'
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, 'lxml')
    for link in soup.find_all('a', {'class': 'feed-item analyt-internal-link-subunit'}):
        test = link.get('href')
        print(test)

def ingredient_spider():
    # Print the ingredients of one recipe; the URL is entered by hand.
    url1 = 'https://tasty.co/recipe/peanut-butter-keto-cookies'
    source_code1 = requests.get(url1)
    new_text = source_code1.text
    soup1 = BeautifulSoup(new_text, 'lxml')
    for ingredients in soup1.find_all("li", {"class": "ingredient xs-mb1 xs-mt0"}):
        print(ingredients.text)
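
One possible way to connect the two steps is to have the first function return the links instead of printing them, then loop over that list and pass each link to the second function. The sketch below is only an illustration under several assumptions: the names `extract_recipe_links`, `extract_ingredients`, `fetch_html` and `crawl` are hypothetical, the class names are copied from the question and may not match the live site, `html.parser` is swapped in for `lxml` so no extra parser is needed, and `urljoin` is used in case the `href` values are relative.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

LISTING_URL = "https://tasty.co/topic/best-vegetarian"


def extract_recipe_links(html, base_url):
    """Return absolute recipe URLs found in the listing page HTML."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # Selector built from the class names in the question (assumption:
    # the site still uses them).
    for anchor in soup.select("a.feed-item.analyt-internal-link-subunit"):
        href = anchor.get("href")
        if href:
            # Resolve relative hrefs such as "/recipe/..." against the page URL.
            links.append(urljoin(base_url, href))
    return links


def extract_ingredients(html):
    """Return the text of every ingredient list item in a recipe page."""
    soup = BeautifulSoup(html, "html.parser")
    return [li.get_text(" ", strip=True) for li in soup.select("li.ingredient")]


def fetch_html(url):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def crawl(listing_url=LISTING_URL, fetch=fetch_html):
    """Map each recipe URL on the listing page to its list of ingredients."""
    recipes = {}
    listing_html = fetch(listing_url)
    for link in extract_recipe_links(listing_html, listing_url):
        recipes[link] = extract_ingredients(fetch(link))
    return recipes
```

Calling `crawl()` returns a dictionary keyed by recipe URL, so printing the ingredients becomes a plain loop over `crawl().items()`. Keeping the parsing functions separate from the download step also lets them be tried on saved HTML without touching the network.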
【Comments】:
Tags: python beautifulsoup web-crawler