【Question title】: How do I fix this "name is not defined" error?
【Posted】: 2020-05-22 10:09:27
【Question】:
from bs4 import BeautifulSoup


def us_30():
    page = session.get('https://www.investing.com/indices/us-30-technical')
    soup = BeautifulSoup(page.content, 'html.parser')
    summary = soup.find(id="techStudiesInnerWrap")
    print(summary.div.text)
    name = soup.find("td", class_="first left symbol", string="RSI(14)")
    value = name.find_next('td')
    action = value.find_next('td')
    print(f"Name: {name.text}. Value:{value.text}. Action: {action.span.text}")


us_30()

I am trying to get the RSI value from the website.

Error:

【Comments on the question】:

    Tags: python python-3.x web-scraping


    【Solution 1】:

    You need to create a requests session:

    import requests
    from bs4 import BeautifulSoup
    
    
    def us_30():
        session = requests.Session()
        page = session.get('https://www.investing.com/indices/us-30-technical')
        soup = BeautifulSoup(page.content, 'html.parser')
        print(soup)
        summary = soup.find(id="techStudiesInnerWrap")
        print(summary.div.text)
        name = soup.find("td", class_="first left symbol", string="RSI(14)")
        value = name.find_next('td')
        action = value.find_next('td')
        print(f"Name: {name.text}. Value:{value.text}. Action: {action.span.text}")
    
    
    us_30()
    

    Output:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    
    <html>
    <head>
    <title>403 You are banned from this site.  Please contact via a different client configuration if you believe that this is a mistake.</title>
    </head>
    <body>
    <h1>Error 403 You are banned from this site.  Please contact via a different client configuration if you believe that this is a mistake.</h1>
    <p>You are banned from this site.  Please contact via a different client configuration if you believe that this is a mistake.</p>
    <h3>Guru Meditation:</h3>
    <p>XID: 1557864559</p>
    <hr/>
    <p>Varnish cache server</p>
    </body>
    </html>
    
    Traceback (most recent call last):
      File "x.py", line 18, in <module>
        us_30()
      File "x.py", line 11, in us_30
        print(summary.div.text)
    AttributeError: 'NoneType' object has no attribute 'div'
    

    Now you just need to figure out how not to get banned :)
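
As a follow-up to the 403 page in the output above, here is a minimal sketch of checking the HTTP status before parsing, so a ban surfaces as a clear error rather than as the later `AttributeError`. The helper name `ensure_ok` is made up for illustration, and the `Response` object is built by hand only so the example runs without network access. This detects the ban; it does not avoid it.

```python
import requests


def ensure_ok(response):
    """Raise requests.HTTPError for 4xx/5xx responses, otherwise return the body."""
    response.raise_for_status()
    return response.content


# Offline demonstration: a hand-built Response that mimics the 403 shown above.
banned = requests.Response()
banned.status_code = 403
banned.url = 'https://www.investing.com/indices/us-30-technical'

try:
    ensure_ok(banned)
    blocked = False
except requests.HTTPError as exc:
    blocked = True
    print(f"Request was rejected: {exc}")
```

In the real script you would call `ensure_ok(page)` right after `page = session.get(...)` and pass the result to `BeautifulSoup`.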

    【Comments】:

      【Solution 2】:

      It looks like you have not defined what the session variable is. If you are using the requests module, I would expect you to have something like:

      import requests
      url = 'https://www.investing.com/indices/us-30-technical'  # URL from the question
      page = requests.get(url)
      

      Or I suppose you are using a Selenium session. Please correct the code accordingly.
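
To illustrate why the original code fails, here is a small sketch that reproduces the error without touching the network. The function name `fetch_without_session` is hypothetical; the point is only that `session` is never assigned, so Python raises `NameError` when the line runs, before any request is made.

```python
def fetch_without_session(url):
    # `session` is not assigned in this function or at module level,
    # so this line raises NameError when it executes
    # (not when the function is defined).
    return session.get(url)


try:
    fetch_without_session('https://www.investing.com/indices/us-30-technical')
except NameError as exc:
    message = str(exc)
    print(message)
```

Assigning the name first, for example `session = requests.Session()` after `import requests`, removes the `NameError`.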

      【Comments】:

      • "C:\Users\Joao Fernandes\PycharmProjects\untitled\venv\Scripts\python.exe" "C:/Users/Joao Fernandes/PycharmProjects/untitled/main.py" Traceback (most recent call last): File "C:/Users/Joao Fernandes/PycharmProjects/untitled/main.py", line 16, in <module> us_30() File "C:/Users/Joao Fernandes/PycharmProjects/untitled/main.py", line 9, in us_30 print(summary.div.text) AttributeError: 'NoneType' object has no attribute 'div' Process finished with exit code 1
      • @Shrutheesh Raman: Importing requests does not solve the problem.
      • @MauriceMeyer, I only used requests as an example. It depends on what kind of web scraping you are going to do.
      • I noticed that the web page you want to scrape has embedded JavaScript, so you will have to sort that part out first. You are getting the NoneType error because the data was not downloaded correctly in the first place.
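
Following on from the last comment, here is a minimal sketch of guarding against a missing element, so that a page without the expected table yields `None` instead of an `AttributeError`. The function name `parse_rsi` and both HTML strings are hand-written assumptions for illustration; the real page's markup may differ.

```python
from bs4 import BeautifulSoup


def parse_rsi(html):
    """Return (name, value, action) for the RSI(14) row, or None if it is absent."""
    soup = BeautifulSoup(html, 'html.parser')
    name = soup.find("td", class_="first left symbol", string="RSI(14)")
    if name is None:
        # The row is missing, e.g. because a ban page was returned.
        return None
    value = name.find_next('td')
    action = value.find_next('td') if value is not None else None
    if value is None or action is None or action.span is None:
        return None
    return name.text, value.text, action.span.text


# Hypothetical sample inputs, not copied from the real site.
BANNED_PAGE = (
    "<html><head><title>403 You are banned from this site.</title></head>"
    "<body></body></html>"
)
SAMPLE_ROW = (
    '<table><tr><td class="first left symbol">RSI(14)</td>'
    '<td class="right">55.2</td>'
    '<td class="left"><span>Buy</span></td></tr></table>'
)

print(parse_rsi(BANNED_PAGE))
print(parse_rsi(SAMPLE_ROW))
```

The same `is None` check applies to `summary = soup.find(id="techStudiesInnerWrap")` before reading `summary.div.text`.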