【Question title】: Looping over URLs with a 'for' loop, and string vs. integer problems
【Posted】: 2018-11-22 11:00:44
【Question】:

I'm trying to figure out how to use a 'for' loop (or any other loop) to loop over URLs.

I'm scraping data from Howlongtobeat.com, and the URLs are structured like this:

https://howlongtobeat.com/game.php?id=38050

Only the number after "id=" changes. How do I make the number at the end of the string change?

page_number = range (38040, 38060)

url = 'https://howlongtobeat.com/game.php?id={page_number}'

This doesn't work, because nothing actually gets added to the string.

url = 'https://howlongtobeat.com/game.php?id=' + page_number 

That didn't work either, because I got this error:

 TypeError: must be str, not range

FYI, I'm using BeautifulSoup and a csv writer to scrape the data and write it to a CSV file.

I'm a beginner at this, so I'm starting from scratch.

Thanks!

Tags: python python-3.x url web-scraping beautifulsoup


【Solution 1】:
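url = 'https://howlongtobeat.com/game.php?id='

# range() yields integers (the stop value 38060 is excluded),
# so convert each one with str() before concatenating it to the URL.
for page in range(38040, 38060):
    new_url = url + str(page)
    print(new_url)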

Output:

C:\Users\siva\Desktop>python test.py
https://howlongtobeat.com/game.php?id=38040
https://howlongtobeat.com/game.php?id=38041
https://howlongtobeat.com/game.php?id=38042
https://howlongtobeat.com/game.php?id=38043
https://howlongtobeat.com/game.php?id=38044
https://howlongtobeat.com/game.php?id=38045
https://howlongtobeat.com/game.php?id=38046
https://howlongtobeat.com/game.php?id=38047
https://howlongtobeat.com/game.php?id=38048
https://howlongtobeat.com/game.php?id=38049
https://howlongtobeat.com/game.php?id=38050
https://howlongtobeat.com/game.php?id=38051
https://howlongtobeat.com/game.php?id=38052
https://howlongtobeat.com/game.php?id=38053
https://howlongtobeat.com/game.php?id=38054
https://howlongtobeat.com/game.php?id=38055
https://howlongtobeat.com/game.php?id=38056
https://howlongtobeat.com/game.php?id=38057
https://howlongtobeat.com/game.php?id=38058
https://howlongtobeat.com/game.php?id=38059
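
The two attempts in the question fail for different reasons: the first string contains `{page_number}` but has no `f` prefix, so the braces stay literal text, and the second tries to add a whole `range` object to a string. As a minimal sketch of an alternative to `str()` concatenation, an f-string inside the loop formats each integer directly. The names `BASE_URL` and `build_urls` are made up for illustration and are not part of the original answer.

```python
BASE_URL = 'https://howlongtobeat.com/game.php?id='  # same base URL as in the question


def build_urls(start, stop):
    """Return one URL per id in range(start, stop); stop is excluded."""
    # The f prefix makes Python substitute {page}; without it the braces stay literal.
    return [f'{BASE_URL}{page}' for page in range(start, stop)]


urls = build_urls(38040, 38060)
print(urls[0])    # first URL in the list
print(len(urls))  # number of URLs built
```

Both approaches produce the same strings; the f-string form just avoids the explicit `str()` call.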

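Since the question mentions writing the results to a CSV file, here is a rough sketch of how the same loop could feed a `csv.writer`. The function `scrape_to_csv`, the `fetch_title` callable, the column names and the file name in the comment are all hypothetical. In a real run, `fetch_title` would download the page (for example with `requests.get`) and pull a value out with BeautifulSoup; that step is replaced by a stand-in here so the example runs without network access.

```python
import csv
import io

BASE_URL = 'https://howlongtobeat.com/game.php?id='


def scrape_to_csv(ids, fetch_title, out_file):
    """Write one CSV row per id; fetch_title is a hypothetical callable(url) -> str."""
    writer = csv.writer(out_file)
    writer.writerow(['id', 'url', 'title'])  # header row (column names are assumptions)
    for page in ids:
        url = BASE_URL + str(page)  # int -> str before concatenating
        writer.writerow([page, url, fetch_title(url)])


def fake_fetch_title(url):
    # Stand-in for the real download-and-parse step, so no request is made.
    return 'title for ' + url.rsplit('=', 1)[1]


# In-memory file for the demo; on disk this could be
# open('games.csv', 'w', newline='') with a hypothetical file name.
buffer = io.StringIO()
scrape_to_csv(range(38040, 38043), fake_fetch_title, buffer)
print(buffer.getvalue())
```

Passing the fetch step in as a function keeps the URL loop and the CSV writing easy to try out on their own before pointing them at the real site.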