Why can't I get cookie value in Playwright?

【Question title】: Why can't I get cookie value in Playwright?
【Posted】: 2022-08-24 14:16:25
【Question】:

First of all, sorry for my poor English.

I want to get cookies with Playwright, but I can't. I tried the 3 methods I found and got nothing useful.

Using page.on

# Define the handler before registering it; it prints the headers of each event.
def get_cookie(request):
    allheaders = request.all_headers()
    print(allheaders)

page.on('request', get_cookie)
page.on('response', get_cookie)


>>>
{'accept-ranges': 'bytes', 'age': '9576', 'cache-control': 'max-age=600', 'content-length': '6745', 'content-type': 'image/png', 'date': 'Thu, 30 Jun 2022 01:09:20 GMT', 'etag': '"206578bcab2ad71:0"', 'expires': 'Thu, 30 Jun 2022 01:19:20 GMT', 'last-modified': 'Tue, 06 Apr 2021 06:11:52 GMT', 'server': 'NWS_SPMid', 'x-cache-lookup': 'Cache Hit', 'x-daa-tunnel': 'hop_count=1', 'x-nws-log-uuid': '16892018456232999193', 'x-powered-by': 'ASP.NET'}
{'accept-ranges': 'bytes', 'age': '9576', 'cache-control': 'max-age=600', 'content-length': '6745', 'content-type': 'image/png', 'date': 'Thu, 30 Jun 2022 01:09:20 GMT', 'etag': '"206578bcab2ad71:0"', 'expires': 'Thu, 30 Jun 2022 01:19:20 GMT', 'last-modified': 'Tue, 06 Apr 2021 06:11:52 GMT', 'server': 'NWS_SPMid', 'x-cache-lookup': 'Cache Hit', 'x-daa-tunnel': 'hop_count=1', 'x-nws-log-uuid': '16892018456232999193', 'x-powered-by': 'ASP.NET'}
...(and more like this)

Something is returned, but there is no cookie in it.

Using browser_context.cookies (solved, thanks to @Charchit)

context = browser.new_context()
page = context.new_page()
page.goto(url)
cookies = context.cookies
print(cookies)

>>>
<bound method BrowserContext.cookies of <BrowserContext browser=<Browser type=<BrowserType name=chromium executable_path=/Users/swong/Library/Caches/ms-playwright/chromium-1005/chrome-mac/Chromium.app/Contents/MacOS/Chromium> version=102.0.5005.40>>>

Using JS

cookie = page.evaluate('console.log(document.cookie)')
print(cookie)

>>>
None

I opened the Network tab in the Chromium window, and the request headers there contain the cookie I want.

Please help me, thanks everyone!

Here is my code sample. The website is in Chinese, I hope you don't mind. It is just a simple login page.

from playwright.sync_api import sync_playwright

url = 'https://so.gushiwen.cn/user/login.aspx'

def get_cookie(request_or_response):
    # Works for both Request and Response objects: each has headers_array().
    headersArray = request_or_response.headers_array()
    print('「headersArray」：', headersArray)


with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()

    page.goto(url)
    page.fill('#email', '6j3y4ecy@spymail.one')
    page.fill('#pwd', '6j3y4ecy@spymail.one')

    page.wait_for_timeout(5000)  # input the captcha code manually

    page.on('request', get_cookie)
    page.on('response', get_cookie)

    print('logging in...')
    page.click('#denglu')

    page.wait_for_timeout(50000)  # wait for nothing

    browser.close()

You could create a minimal reproducible example with the URL so that we can copy and test the code.
Done. Sorry, I'm new here.

Tags: python cookies scrapy playwright

【Solution 1】:

In the second method, change cookies = context.cookies to cookies = context.cookies(). It is a method, so you need to call it. Check the documentation:

context = browser.new_context()
page = context.new_page()
page.goto(url)
cookies = context.cookies()
print(cookies)
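
A minimal sketch of what can be done with the result, assuming `context.cookies()` returns a list of dicts that each carry `name` and `value` keys, as the Playwright documentation describes. The helper names `cookies_to_dict` and `cookies_to_header` and the sample values are hypothetical and only there for illustration.

```python
def cookies_to_dict(cookies):
    """Map cookie names to values from a list shaped like context.cookies()."""
    return {c["name"]: c["value"] for c in cookies}


def cookies_to_header(cookies):
    """Build a Cookie request-header value such as 'a=1; b=2'."""
    return "; ".join(f'{c["name"]}={c["value"]}' for c in cookies)


# Hypothetical sample in the shape Playwright returns (values are made up).
sample = [
    {"name": "session", "value": "abc123", "domain": "example.com", "path": "/"},
    {"name": "lang", "value": "zh", "domain": "example.com", "path": "/"},
]

print(cookies_to_dict(sample))
print(cookies_to_header(sample))
```

With a real context, `cookies_to_header(context.cookies())` should give a string that can be sent as the `Cookie` header from another HTTP client.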

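Also, doing it the way the first method does is not recommended. Even if you get the cookie header from a response, you can't really store it and use it elsewhere unless you use a factory function or a global variable. Besides, why do that when BrowserContext has a method specifically for this :)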
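
For readers who still want the event-based route, here is a rough sketch of the factory-function idea mentioned above. The names `make_set_cookie_collector` and `FakeResponse` are hypothetical; `FakeResponse` only imitates the two members of a Playwright response that the handler touches (`url` and `headers_array()`), so the sketch can be tried without a browser.

```python
def make_set_cookie_collector():
    """Return (handler, collected); handler appends Set-Cookie values it sees."""
    collected = []

    def handler(response):
        # headers_array() is the call used in the question; it yields
        # {'name': ..., 'value': ...} entries, one per header line.
        for header in response.headers_array():
            if header["name"].lower() == "set-cookie":
                collected.append((response.url, header["value"]))

    return handler, collected


class FakeResponse:
    """Hypothetical stand-in for a Playwright Response, for a dry run only."""
    url = "https://example.com/login"

    def headers_array(self):
        return [
            {"name": "Content-Type", "value": "text/html"},
            {"name": "Set-Cookie", "value": "session=abc123; Path=/"},
        ]


handler, collected = make_set_cookie_collector()
handler(FakeResponse())
print(collected)
```

With a real page, the handler would be registered with `page.on('response', handler)` before the login click, and `collected` read afterwards. The closure keeps the list out of global scope, which is the point of the factory function.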

Edit

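The first method does not seem to work because it only returns the headers of the requests and responses that were made. Cookies can also be created by JavaScript on the page itself, and those may not show up in the headers at all.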
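
Cookies created by page scripts can be read through `document.cookie`. A small sketch, assuming the string comes from `page.evaluate('document.cookie')`: `evaluate` returns the value of the expression, whereas `console.log(...)` evaluates to `undefined`, which is why the third attempt in the question printed `None`. Cookies flagged `HttpOnly` are not exposed to `document.cookie`, so this view can be incomplete. The helper name `parse_document_cookie` and the sample string are made up.

```python
def parse_document_cookie(cookie_string):
    """Split a document.cookie string ('a=1; b=2') into a dict."""
    result = {}
    for part in cookie_string.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first '=' only, since values may contain '='.
        name, _, value = part.partition("=")
        result[name] = value
    return result


# Made-up sample string in the format document.cookie uses.
sample = "login=1; theme=dark"
print(parse_document_cookie(sample))
```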

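Secondly, judging from the headers you printed for the first method in your question, the output seems to cover only a single request. When I ran your code, many more requests and responses came in and many more headers were printed. From the responses in particular, you can retrieve the cookies set by the server by searching for the 'set-cookie' header.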
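
Once a `set-cookie` value has been found, it can be split into its name, value and attributes. The sketch below uses the standard library's `http.cookies.SimpleCookie` as one possible parser; it assumes the header value holds a single well-formed cookie, and the helper name `parse_set_cookie` and the sample header are hypothetical.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Return (name, value, attributes) for one Set-Cookie header value."""
    jar = SimpleCookie()
    jar.load(header_value)
    # Assumes exactly one cookie was parsed from the header value.
    name, morsel = next(iter(jar.items()))
    # Keep only the attributes that were actually present in the header.
    attributes = {key: val for key, val in morsel.items() if val}
    return name, morsel.value, attributes


# Made-up header value; a real one would come from a response's headers.
name, value, attributes = parse_set_cookie("session=abc123; Path=/; HttpOnly")
print(name, value, attributes)
```

`SimpleCookie` is a fairly lenient parser, so for unusual header values it may be safer to rely on `context.cookies()` as recommended above.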

【Discussion】:

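Thanks! I got it. It is slightly different from the cookie shown in the console, though. Anyway, this cookie works, thanks again. And... although the first method is not advisable, I'm still curious why it can't get the right cookie :(
@SSjoewvv Check my edit
Yes, yes, I also received a lot of requests and responses using page.on. I wanted the post to stay readable, so I posted only one of the returned results. Sorry that this caused some misunderstanding, and thank you very much for the edit.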