【Question Title】:An error occured during an HTTP request: HTTP Error 404: Not Found (GetOldTweets3)
【Posted】:2021-01-05 01:57:10
【Question】:

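When I try to scrape tweets by username, I get the message "An error occured during an HTTP request: HTTP Error 404: Not Found. Try to open in browser: https://twitter.com/search?q=%20from%3A3mindia%20since%3A2020-04-01%20until%3A2020-04-30&src=typd"

When I open that link in a browser, the page loads normally and is not a 404.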


    An error occured during an HTTP request: HTTP Error 404: Not Found
    Try to open in browser: https://twitter.com/search?q=%20from%3A3mindia%20since%3A2020-04-01%20until%3A2020-04-30&src=typd
Traceback (most recent call last):
  File "C:\Users\\anaconda3\lib\site-packages\GetOldTweets3\manager\TweetManager.py", line 343, in getJsonResponse
    response = opener.open(url)
  File "C:\Users\\anaconda3\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Users\\anaconda3\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Users\\anaconda3\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Users\\anaconda3\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Users\\anaconda3\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-41ed802d4e96>", line 1, in <module>
    CleanTweets("3mindia", "2020-04-01", "2020-04-30", "3M India")
  File "<ipython-input-2-1fa226d02b36>", line 3, in CleanTweets
    tweets = got.manager.TweetManager.getTweets(tweetCriteria)
  File "C:\Users\\anaconda3\lib\site-packages\GetOldTweets3\manager\TweetManager.py", line 65, in getTweets
    json = TweetManager.getJsonResponse(tweetCriteria, refreshCursor, cookieJar, proxy, user_agent, debug=debug)
  File "C:\Users\\anaconda3\lib\site-packages\GetOldTweets3\manager\TweetManager.py", line 348, in getJsonResponse
    sys.exit()
SystemExit

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\\anaconda3\lib\site-packages\IPython\core\ultratb.py", line 1151, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "C:\Users\Monis\anaconda3\lib\site-packages\IPython\core\ultratb.py", line 319, in wrapped
    return f(*args, **kwargs)
  File "C:\Users\\anaconda3\lib\site-packages\IPython\core\ultratb.py", line 353, in _fixed_getinnerframes
    records = fix_frame_records_filenames(inspect.getinnerframes(etb, context))
  File "C:\Users\\anaconda3\lib\inspect.py", line 1502, in getinnerframes
    frameinfo = (tb.tb_frame,) + getframeinfo(tb, context)
AttributeError: 'tuple' object has no attribute 'tb_frame'
An exception has occurred, use %tb to see the full traceback.

---------------------------------------------------------------------------

During handling of the above exception, another exception occurred:

SystemExit

【Comments】:

    Tags: python web-scraping twitter


    【Solution 1】:

    For details about the error, see:

    https://github.com/Mottl/GetOldTweets3/issues/98

    Twitter has probably removed the endpoint that GetOldTweets3 relies on: https://twitter.com/i/search/timeline?

    See also this post: Why getoldtweets3 library provides 404 error?
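
    The traceback in the question ends in SystemExit because, as its frames show, getJsonResponse calls sys.exit() after the HTTP error. SystemExit derives from BaseException rather than Exception, so a plain "except Exception" will not catch it. Below is a minimal sketch of guarding such a call. Note that fetch_tweets is a hypothetical stand-in that only imitates this behaviour; it is not part of GetOldTweets3, and safe_fetch is likewise an illustrative name.

```python
import sys


def fetch_tweets(username):
    """Hypothetical stand-in: imitates a library call that exits on an HTTP error."""
    sys.exit()


def safe_fetch(username):
    """Return the fetched tweets, or None if the call tried to exit the interpreter."""
    try:
        return fetch_tweets(username)
    except SystemExit:
        # sys.exit() raises SystemExit, which `except Exception` would not catch.
        return None


result = safe_fetch("3mindia")
print(result)
```

    This only keeps a notebook session or a loop over many usernames alive; it does not make the removed endpoint respond again.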

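    【Comments】:

      【Solution 2】:

      One user appears to have changed the code from the GOT3 library and managed to work around the issue. The only remaining problem is that the resulting HTML is not formatted, so someone still needs to work on that. Specifically, the changes they mention are: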

      updated user_agents (updated with the ones used by TWINT);

      updated endpoint (/search?);

      some updates to the URL structure:

          url = "https://twitter.com/search?"

          url += ("q=%%20%s&src=typd%s"
                  "&include_available_features=1&include_entities=1&max_position=%s"
                  "&reset_error_state=false")

          if not tweetCriteria.topTweets:
              url += "&f=live"
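
      To see what URL the template above produces, here is a small self-contained sketch. It assumes the three %s placeholders are filled with the URL-encoded query, an optional language parameter, and a pagination cursor; the answer does not show that part, so the function name build_search_url and its arguments are illustrative assumptions, not GetOldTweets3 API.

```python
from urllib.parse import quote

BASE_URL = "https://twitter.com/search?"
TEMPLATE = ("q=%%20%s&src=typd%s"
            "&include_available_features=1&include_entities=1&max_position=%s"
            "&reset_error_state=false")


def build_search_url(username, since, until, lang_param="", cursor="", top_tweets=False):
    """Fill the URL template from the answer (argument names are assumptions)."""
    url = BASE_URL + TEMPLATE
    if not top_tweets:
        url += "&f=live"
    query = "from:%s since:%s until:%s" % (username, since, until)
    # "%%" in the template becomes a literal "%", so "%%20" yields "%20".
    return url % (quote(query), lang_param, quote(cursor))


print(build_search_url("3mindia", "2020-04-01", "2020-04-30"))
```

      With the values from the question, the result begins with the same https://twitter.com/search?q=%20from%3A3mindia... address that appears in the error message, followed by the extra parameters and &f=live.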

      【Comments】:
