【Question title】: Retrieving game titles only from the Steam website using a regex [closed]
【Posted】: 2016-09-21 00:22:11
【Question description】:

This code

import re
from urllib import urlopen  # Python 2.7; the function is urlopen, not urlOpen

gameUrl = 'http://store.steampowered.com/stats/'
gamePage = urlopen(gameUrl)
gameContents = gamePage.read()

# Pattern exactly as posted in the question
regex = r"""/(href=http:)\/\/(store)\.(steampowered)\.(com)\/(app)\/([0-9]+)\/(">)(\w*\(*\)*(.)*(')*(-)*(:)*\s*)*(<)\/(a>)"""
gameFirst = re.findall(regex, gameContents)
print gameFirst

gives this result:

【Discussion】:

    Tags: html regex python-2.7 findall urlopen


    【Solution 1】:

    Use an HTML parser and select the links by the div's id:

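    import requests
    from bs4 import BeautifulSoup
    
    # Name the parser explicitly so bs4 does not have to guess one
    soup = BeautifulSoup(requests.get("http://store.steampowered.com/stats/").text, "html.parser")
    
    # Absolute links inside the element with id="detailStats"
    print([a["href"] for a in soup.select("#detailStats a[href^=http]")])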
    
    ['http://store.steampowered.com/app/570/', 'http://store.steampowered.com/app/730/', 'http://store.steampowered.com/app/440/', 'http://store.steampowered.com/app/377160/', 'http://store.steampowered.com/app/346110/', 'http://store.steampowered.com/app/8930/', 'http://store.steampowered.com/app/4000/', 'http://store.steampowered.com/app/378120/', 'http://store.steampowered.com/app/372000/', 'http://store.steampowered.com/app/252950/', 'http://store.steampowered.com/app/72850/', 'http://store.steampowered.com/app/252490/', 'http://store.steampowered.com/app/230410/', 'http://store.steampowered.com/app/281990/', 'http://store.steampowered.com/app/107410/', 'http://store.steampowered.com/app/374320/', 'http://store.steampowered.com/app/105600/', 'http://store.steampowered.com/app/271590/', 'http://store.steampowered.com/app/304930/', 'http://store.steampowered.com/app/379720/', 'http://store.steampowered.com/app/363970/', 'http://store.steampowered.com/app/292030/', 'http://store.steampowered.com/app/389430/', 'http://store.steampowered.com/app/413150/', 'http://store.steampowered.com/app/386360/', 'http://store.steampowered.com/app/10/', 'http://store.steampowered.com/app/236390/', 'http://store.steampowered.com/app/433850/', 'http://store.steampowered.com/app/311210/', 'http://store.steampowered.com/app/214950/', 'http://store.steampowered.com/app/365590/', 'http://steamcommunity.com/app/295270', 'http://store.steampowered.com/app/359550/', 'http://store.steampowered.com/app/236850/', 'http://steamcommunity.com/app/323370', 'http://store.steampowered.com/app/48700/', 'http://store.steampowered.com/app/550/', 'http://store.steampowered.com/app/238960/', 'http://store.steampowered.com/app/49520/', 'http://store.steampowered.com/app/359870/', 'http://store.steampowered.com/app/304050/', 'http://store.steampowered.com/app/227300/', 'http://store.steampowered.com/app/301520/', 'http://store.steampowered.com/app/427520/', 'http://store.steampowered.com/app/250900/', 
'http://store.steampowered.com/app/291550/', 'http://store.steampowered.com/app/218620/', 'http://store.steampowered.com/app/251570/', 'http://store.steampowered.com/app/39210/', 'http://store.steampowered.com/app/221380/', 'http://store.steampowered.com/app/268500/', 'http://store.steampowered.com/app/211420/', 'http://store.steampowered.com/app/240/', 'http://store.steampowered.com/app/346900/', 'http://store.steampowered.com/app/255710/', 'http://store.steampowered.com/app/221100/', 'http://store.steampowered.com/app/325610/', 'http://store.steampowered.com/app/316010/', 'http://store.steampowered.com/app/322330/', 'http://store.steampowered.com/app/220200/', 'http://store.steampowered.com/app/226320/', 'http://store.steampowered.com/app/203770/', 'http://store.steampowered.com/app/227940/', 'http://store.steampowered.com/app/391540/', 'http://store.steampowered.com/app/244850/', 'http://store.steampowered.com/app/394230/', 'http://store.steampowered.com/app/231430/', 'http://store.steampowered.com/app/22380/', 'http://store.steampowered.com/app/201270/', 'http://store.steampowered.com/app/428690/', 'http://store.steampowered.com/app/273110/', 'http://store.steampowered.com/app/10500/', 'http://store.steampowered.com/app/218230/', 'http://store.steampowered.com/app/370240/', 'http://store.steampowered.com/app/268850/', 'http://store.steampowered.com/app/295110/', 'http://store.steampowered.com/app/4700/', 'http://store.steampowered.com/app/359320/', 'http://store.steampowered.com/app/33930/', 'http://store.steampowered.com/app/313160/', 'http://store.steampowered.com/app/262060/', 'http://store.steampowered.com/app/333930/', 'http://store.steampowered.com/app/47890/', 'http://store.steampowered.com/app/291480/', 'http://www.footballmanager.com/', 'http://store.steampowered.com/app/264710/', 'http://store.steampowered.com/app/383120/', 'http://store.steampowered.com/app/322170/', 'http://store.steampowered.com/app/270880/', 
'http://store.steampowered.com/app/287700/', 'http://store.steampowered.com/app/10090/', 'http://store.steampowered.com/app/466910/', 'http://store.steampowered.com/app/242760/', 'http://store.steampowered.com/app/226860/', 'http://store.steampowered.com/app/311690/', 'http://store.steampowered.com/app/219990/', 'http://store.steampowered.com/app/334230/', 'http://store.steampowered.com/app/314160/', 'http://store.steampowered.com/app/438040/']
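
The list above also contains a few links that are not store app pages (two steamcommunity.com entries and footballmanager.com). If only store apps are wanted, one possible follow-up is to filter the list with a small regex. This is a minimal sketch in Python 3; the names `STORE_APP`, `store_app_ids` and `sample` are made up for illustration and are not part of the answer.

```python
import re

# Matches store app pages such as http://store.steampowered.com/app/570/
# and captures the numeric app ID.
STORE_APP = re.compile(r"^https?://store\.steampowered\.com/app/(\d+)/?$")


def store_app_ids(links):
    """Return the app IDs of the links that point at a Steam store app page."""
    ids = []
    for link in links:
        match = STORE_APP.match(link)
        if match:
            ids.append(int(match.group(1)))
    return ids


# A few entries copied from the output above.
sample = [
    "http://store.steampowered.com/app/570/",
    "http://steamcommunity.com/app/295270",
    "http://www.footballmanager.com/",
    "http://store.steampowered.com/app/730/",
]
print(store_app_ids(sample))  # [570, 730]
```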
    

    【Comments】:

    • I'm restricted to using only urllib, findall, and HTML Parser.
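
Given the restriction in the comment above, the titles could probably be collected with the standard library alone. The following is a rough sketch in Python 3, not a tested scraper for the live page: `TitleParser`, `game_titles`, `titles_with_findall` and `SAMPLE_HTML` are hypothetical names, and the sample markup (including the `gameLink` class) is invented to resemble a link list, so the real page may need a different pattern.

```python
import re
from html.parser import HTMLParser  # Python 3; in Python 2 the module is HTMLParser

# Assumption: a game link is an <a> whose href is a store app page.
APP_LINK = re.compile(r"^https?://store\.steampowered\.com/app/\d+/?")


class TitleParser(HTMLParser):
    """Collect the text of every <a> whose href is a Steam store app page."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # text chunks gathered while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if APP_LINK.match(href):
                self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            title = "".join(self._buffer).strip()
            if title:
                self.titles.append(title)
            self._buffer = None


def game_titles(html):
    """Parser-based variant: return the link text of store app anchors."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


def titles_with_findall(html):
    """findall-based variant: more brittle, assumes double-quoted hrefs."""
    pattern = (r'<a\b[^>]*\bhref="https?://store\.steampowered\.com/app/\d+/?"'
               r'[^>]*>\s*([^<]+?)\s*</a>')
    return re.findall(pattern, html)


# Invented sample markup, used so the sketch runs without network access.
SAMPLE_HTML = """
<div id="detailStats">
  <a class="gameLink" href="http://store.steampowered.com/app/570/">Dota 2</a>
  <a class="gameLink" href="http://store.steampowered.com/app/730/">Counter-Strike: Global Offensive</a>
  <a href="http://www.footballmanager.com/">Football Manager 2016</a>
</div>
"""
print(game_titles(SAMPLE_HTML))
print(titles_with_findall(SAMPLE_HTML))
```

To try it against the real page, the HTML could be fetched with `urllib.request.urlopen(url).read().decode("utf-8")` and passed to either function. The parser variant copes with attribute order and quoting differences, while the `findall` variant depends on the exact shape of the anchor tag.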