【Question title】: How can I set a cookie on a Python Scrapy HTTP request?
【Posted】: 2020-09-26 09:04:22
【Question】:

I have the following code in Python Scrapy to crawl a website, sending the cookies in the request headers.

        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
            'cookie': 'visid_incap_1388804=z0HpykCPTPOl91x1ZVgBin26ql4AAAAAQUIPAAAAAAAP3GOgoT20BZ/L8SMtjnDl; _gcl_au=1.1.924686404.1588247167; _ga=GA1.3.462736265.1588247167; _hjid=6cf736f7-6277-4326-9175-f50c1a373252; _fbp=fb.2.1588247167593.544254491; __gads=ID=a40d8232240136b3:T=1588247168:S=ALNI_MYjyYGgkvkfTsidqhzhkwQuLC9niQ; oth.sid=s%3AUDaLYz674-QEw8q_F0EBBqKxc1Fq8O6y.qs6ZhO%2B9TEJJgoYyDloSOD%2F0LXJOyiZ2CoMRbNjhXyc; nlbi_1388804=BeWWYqa/eg5iFqlA7jV9NwAAAACu/PEk+hkcYO5CjDDQsQRb; _gid=GA1.3.1321216902.1591447562; incap_ses_312_1388804=WK0DD6USuDnBjixreXJUBKuI3F4AAAAA6s4uCe5fmSv6TCkCePRobg==',
            'authority': 'www.onthehouse.com.au',
            'cache-control': 'max-age=0',
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            'accept-language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7,fr;q=0.6,de;q=0.5',
            "cache-control": "no-cache",
            "pragma": "no-cache",
            "sec-fetch-dest": "document",
            "sec-fetch-mode": "navigate",
            "sec-fetch-site": "none",
            "sec-fetch-user": "?1",
            "upgrade-insecure-requests": "1",
            "referrerPolicy": "no-referrer-when-downgrade"
        }
        yield scrapy.http.Request(urls[0], headers=headers)

However, in the log I can see the Set-Cookie headers below, and their values look different from the cookies in my request:

DEBUG:scrapy.downloadermiddlewares.cookies:Received cookies from: <200 https://www.onthehouse.com.au/property-for-rent/vic/aspendale-gardens-3195>
Set-Cookie: visid_incap_1388804=RwmeJ/64TviesvhgjXZIQSDK3F4AAAAAQUIPAAAAAAD4Pp0pBTdJ3taXYbWHPW7O; expires=Mon, 07 Jun 2021 07:18:25 GMT; HttpOnly; path=/; Domain=.onthehouse.com.au

Set-Cookie: incap_ses_312_1388804=zn6FMsJVp1kAazNreXJUBHTK3F4AAAAAlo68QiIx5K4rzyjwMjMGaA==; path=/; Domain=.onthehouse.com.au

2020-06-07 21:07:33 [scrapy.downloadermiddlewares.cookies] DEBUG: Received cookies from: <200 https://www.onthehouse.com.au/property-for-rent/vic/aspendale-gardens-3195>
Set-Cookie: visid_incap_1388804=RwmeJ/64TviesvhgjXZIQSDK3F4AAAAAQUIPAAAAAAD4Pp0pBTdJ3taXYbWHPW7O; expires=Mon, 07 Jun 2021 07:18:25 GMT; HttpOnly; path=/; Domain=.onthehouse.com.au

Set-Cookie: incap_ses_312_1388804=zn6FMsJVp1kAazNreXJUBHTK3F4AAAAAlo68QiIx5K4rzyjwMjMGaA==; path=/; Domain=.onthehouse.com.au

I would like to know how I should set the cookies on the request.

【Discussion】:

    Tags: python cookies scrapy


    【Solution 1】:

    You need to pass the cookies through the cookies parameter of your Request, rather than as a 'cookie' entry in headers:

    from scrapy import Request

    # Each cookie is one name/value entry in the dict passed as `cookies`.
    Request(
        url="http://www.example.com",
        cookies={
            'visid_incap_1388804': 'z0HpykCPTPOl91x1ZVgBin26ql4AAAAAQUIPAAAAAAAP3GOgoT20BZ/L8SMtjnDl',
            'oth.sid': 's%3AUDaLYz674-QEw8q_F0EBBqKxc1Fq8O6y.qs6ZhO%2B9TEJJgoYyDloSOD%2F0LXJOyiZ2CoMRbNjhXyc',
        },
    )
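
    Since the question starts from a single raw cookie string copied out of a browser, it may help to convert that string into the dict that the cookies parameter expects. The following is a minimal sketch using only the standard library; the helper name cookie_header_to_dict is hypothetical (it is not part of Scrapy), and the sample string is shortened from the one in the question.

```python
def cookie_header_to_dict(raw_header: str) -> dict:
    """Split a raw 'Cookie' header value into a {name: value} dict.

    Hypothetical helper for illustration; not a Scrapy API.
    """
    cookies = {}
    for pair in raw_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first '=' only, so values that contain or end
        # with '=' (such as base64 padding) are kept intact.
        name, sep, value = pair.partition("=")
        if not sep:
            continue  # skip fragments that have no '=' at all
        cookies[name.strip()] = value.strip()
    return cookies


# Shortened sample taken from the cookie string in the question.
raw = (
    "_gid=GA1.3.1321216902.1591447562; "
    "incap_ses_312_1388804=WK0DD6USuDnBjixreXJUBKuI3F4AAAAA6s4uCe5fmSv6TCkCePRobg=="
)
cookies = cookie_header_to_dict(raw)
```

    The resulting dict can then be passed as scrapy.Request(url, cookies=cookies, headers=headers), with the 'cookie' entry removed from headers. Scrapy's cookies middleware builds the Cookie header itself, so a value placed directly in headers may not be sent as written. The Set-Cookie lines in the log are cookies issued by the server in its response, so it is expected that they differ from the values sent.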
    

    【Discussion】:
