【Question title】: Python mechanize not recognizing form
【Posted】: 2014-02-06 01:17:50
【Question description】:

I am trying to log in to this page:

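import cookielib  # Python 2 module; renamed http.cookiejar in Python 3

import mechanize
from bs4 import BeautifulSoup

cj = cookielib.LWPCookieJar()
br = mechanize.Browser(factory=mechanize.RobustFactory())
br.set_cookiejar(cj)

# LOGIN_URL is the address of the sign-in page
current_page = br.open(LOGIN_URL)

# Tidy the HTML with BeautifulSoup and hand it back to mechanize
soup = BeautifulSoup(current_page.get_data())
current_page.set_data(soup.prettify())
br.set_response(current_page)

print(soup.findAll('form'))

assert br.viewing_html()

for f in br.forms():
    print(f.name)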

But it prints None for the form, even though BeautifulSoup finds the form without any trouble. Does anyone have any ideas?

【Discussion】:

    Tags: python html beautifulsoup mechanize


    【Solution 1】:

    Something like this should work:

        import cookielib  # Python 2 module; renamed http.cookiejar in Python 3

        import mechanize
        from bs4 import BeautifulSoup

        # Placeholder credentials; replace with real values
        username = 'your-username'
        password = 'your-password'

        br = mechanize.Browser()
        cj = cookielib.LWPCookieJar()
        br.set_cookiejar(cj)

        host = 'https://order.papajohns.com/secure/signin/frame.html?destination=http%3a%2f%2forder.papajohns.com%2findex.html%3fsite%3dWEB%26dclid%3d%2525n-2543611-4121096-71899047-246709315-0%26esvt%3d336192-GOUSe339376223%26esvq%3dpapa%2520johns%26esvadt%3d999999-0-3934985-1%26esvcrea%3d41751468573%26esvplace%26esvd%3dc%26esvaid%3d30536%26gclid%3dCI2psOHqtbwCFRPxOgodr0gAAg'

        br.addheaders = [('User-agent', 'Firefox')]
        br.open(host)
        # Select the first form on the page by position rather than by name
        br.form = list(br.forms())[0]
        br.form['userName'] = username
        br.form['pwd'] = password
        # Keep the response returned by submit() so it can be read below
        response = br.submit()
        code = response.read()
        soup = BeautifulSoup(code)
    

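    【Comments】:

    • This works, thanks! Any idea why the form showed up as None before?
    • My guess is that the form has no name, so mechanize has to pick it out by position with list(br.forms())[0], where 0 is the index of the first form on the page.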
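
The guess in the last comment can be illustrated without mechanize or a network connection. The sketch below uses only the Python 3 standard library's html.parser on a made-up page (SAMPLE_HTML, FormCollector and collect_forms are hypothetical names, not part of mechanize or of the original site). It shows that a form without a name attribute reports None as its name, so it can only be addressed by its position in the list of forms.

```python
from html.parser import HTMLParser

# Hypothetical page: the first (login) form has no name attribute,
# the second form does.
SAMPLE_HTML = """
<html><body>
  <form action="/signin" method="post">
    <input type="text" name="userName">
    <input type="password" name="pwd">
  </form>
  <form name="search" action="/search">
    <input type="text" name="q">
  </form>
</body></html>
"""


class FormCollector(HTMLParser):
    """Collect every <form> with its name attribute and its input names."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._in_form = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            # dict.get returns None when the form has no name attribute
            self.forms.append({"name": attrs.get("name"), "fields": []})
            self._in_form = True
        elif tag == "input" and self._in_form:
            self.forms[-1]["fields"].append(attrs.get("name"))

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_form = False


def collect_forms(html):
    """Return a list of dicts, one per form, in document order."""
    parser = FormCollector()
    parser.feed(html)
    return parser.forms


forms = collect_forms(SAMPLE_HTML)
print([f["name"] for f in forms])  # the unnamed form shows up as None

# An unnamed form cannot be looked up by name, only by position.
login_form = forms[0]
print(login_form["fields"])
```

In mechanize itself, selecting by position can also be written as br.select_form(nr=0), which has the same effect as assigning list(br.forms())[0] to br.form.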