How can I retrieve data from a web page with a loading screen? Answers

【Question title】: How can I retrieve data from a web page with a loading screen?
【Posted】: 2020-06-09 13:50:08
【Question description】:

I am using the requests library to retrieve data about a user's progress from nitrotype.com/racer/insert_name_here, with the following code:

import requests

base_url = 'https://www.nitrotype.com/racer/'
name = 'test'
url = base_url + name

page = requests.get(url)
print(page.text)

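My problem is that this retrieves the HTML of the loading screen, and I want the data that appears after the loading screen. Is this possible, and if so, how?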

【Question comments】:

Tags: python-3.x web-scraping python-requests

【Solution 1】:

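This is probably because the page loads its content dynamically with JavaScript. requests only fetches the initial HTML and does not run scripts, so it sees the loading screen. A browser-automation tool such as selenium or pyppeteer can handle this.

In my example I use pyppeteer to launch a browser and let it run the JavaScript, so that the required information is available.

Example: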

import asyncio

import pyppeteer

async def main():
    # Launch a Chromium browser (a local Chrome install can be used instead).
    browser = await pyppeteer.launch(headless=False)
    # Open a blank page (tab).
    page = await browser.newPage()
    # Navigate to the requested page; the browser runs the site's JavaScript.
    await page.goto('https://www.nitrotype.com/racer/tupac')
    # Get the HTML content of the page as currently rendered.
    cont = await page.content()
    # Close the browser so the Chromium process does not linger.
    await browser.close()
    return cont

# Print the HTML of the user profile "tupac".
print(asyncio.get_event_loop().run_until_complete(main()))
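
If the profile data arrives after the page's load event, `page.content()` may still return the loading screen. One possible way around that, assuming the finished page contains some text you know in advance, is to poll the HTML until that text shows up. The sketch below is only an illustration: `wait_for_marker`, `get_html`, and `marker` are hypothetical names, not part of pyppeteer, and the demo uses a stand-in getter so it runs without a browser.

```python
import asyncio
import time


async def wait_for_marker(get_html, marker, timeout=10.0, interval=0.25):
    """Poll an async HTML getter until `marker` appears or `timeout` elapses.

    get_html is any zero-argument coroutine function that returns the current
    HTML as a string (a hypothetical stand-in for e.g. pyppeteer's page.content).
    """
    deadline = time.monotonic() + timeout
    while True:
        html = await get_html()
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} did not appear within {timeout}s")
        await asyncio.sleep(interval)


async def demo():
    # Stand-in for a real page: the first two polls see the loading screen.
    snapshots = iter(["<p>Loading...</p>", "<p>Loading...</p>", "<p>Races: 42</p>"])

    async def fake_content():
        return next(snapshots)

    return await wait_for_marker(fake_content, "Races", timeout=2.0, interval=0.01)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With pyppeteer this would presumably be called as `await wait_for_marker(page.content, 'some known text')` inside `main()` before returning. pyppeteer also provides `page.waitForSelector`, which waits for a CSS selector rather than polling for text and is likely the better fit when a suitable selector is known.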

【Comments】:

I get this error: Traceback (most recent call last): File "main.py", line 16, in print(asyncio.get_event_loop().run_until_complete()) TypeError: run_until_complete() missing 1 required positional argument: 'future'
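
The TypeError suggests that `run_until_complete()` was called with no argument. It needs the coroutine object produced by calling `main()`. Here is a minimal, browser-free sketch of the call pattern; the body of `main` is a placeholder, not the scraper from the answer.

```python
import asyncio


async def main():
    # Placeholder for the real work (e.g. the pyppeteer calls).
    await asyncio.sleep(0)
    return "<html>...</html>"


# Wrong: loop.run_until_complete()        -> TypeError, no future/coroutine given
# Right: call main() to create the coroutine object and pass it in.
loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(main())
finally:
    loop.close()
print(result)
```

On Python 3.7 and later, `asyncio.run(main())` does the same thing in one call.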