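【Posted】: 2020-11-30 14:09:49
【Question】:
When I use a POST request to fetch data from tokopedia.com, response.json() returns:
[{'errors': [{'message': 'Request not allowed', 'extensions': {}}]}]
I tried to work around it by replacing "https" with "http", which has worked for me in similar situations before, and by using JavaScript fetch to send the same request. Neither fixed the problem.
Here are my headers: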
headers = {
'user-agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36",
'cookie': cookie,
'origin': 'https://www.tokopedia.com',
'content-type': 'application/json',
'accept': '*/*',
'accept-encoding': 'gzip, deflate, br',
'referer': 'https://www.tokopedia.com/p/dapur/aksesoris-dapur',
'content-length': '25794'
};
Here are the request headers shown in Chrome DevTools:
:authority: gql.tokopedia.com
:method: POST
:path: /
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
content-length: 3206
content-type: application/json
cookie: cookie
origin: https://www.tokopedia.com
referer: https://www.tokopedia.com/p/dapur/aksesoris-dapur
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-site
tkpd-userid: 0
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.67 Safari/537.36
x-device: desktop-0.0
x-source: tokopedia-lite
x-tkpd-lite-service: zeus
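One way to see how the two header sets above differ is to compare their names programmatically. The following is only a sketch: the header names are copied from the two listings, the helper `missing_headers` is a hypothetical name introduced here, and whether the server actually requires any of the missing headers is an assumption that would need testing.

```python
# Header names copied from the two listings above (values omitted).
script_headers = {
    "user-agent", "cookie", "origin", "content-type", "accept",
    "accept-encoding", "referer", "content-length",
}
browser_headers = {
    ":authority", ":method", ":path", ":scheme",
    "accept", "accept-encoding", "accept-language", "content-length",
    "content-type", "cookie", "origin", "referer",
    "sec-fetch-dest", "sec-fetch-mode", "sec-fetch-site",
    "tkpd-userid", "user-agent", "x-device", "x-source",
    "x-tkpd-lite-service",
}


def missing_headers(sent, expected):
    """Return names in `expected` that are absent from `sent`.

    Comparison is case-insensitive. HTTP/2 pseudo-headers (names that
    start with ":") are skipped because the client library sets them.
    """
    sent_lower = {name.lower() for name in sent}
    return sorted(
        name.lower()
        for name in expected
        if not name.startswith(":") and name.lower() not in sent_lower
    )


missing = missing_headers(script_headers, browser_headers)
for name in missing:
    print(name)
```

The names it prints are candidates to add to the script's headers one at a time while retrying the request; the comparison alone does not show which of them the server checks.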
Here is the main part of my code:
import requests
headers = headers
payload = [{'operationName': 'SearchProductQuery', 'variables': {'params': '&ob=23&identifier=dapur_aksesoris-dapur&sc=3439&user_id=0&rows=60&start=1&source=directory&device=desktop&page=1&related=true&st=product&safe_search=false', 'adParams': '&page=1&dep_id=3439&ob=23&ep=product&item=15&src=directory&device=desktop&user_id=0&minimum_item=15&start=1&no_autofill_range=5-14'}, 'query': 'query SearchProductQuery($params: String, $adParams: String) {\n CategoryProducts: searchProduct(params: $params) {\n count\n data: products {\n id\n url\n imageUrl: image_url\n imageUrlLarge: image_url_700\n catId: category_id\n gaKey: ga_key\n countReview: count_review\n discountPercentage: discount_percentage\n preorder: is_preorder\n name\n price\n original_price\n rating\n wishlist\n labels {\n title\n color\n __typename\n }\n badges {\n imageUrl: image_url\n show\n __typename\n }\n shop {\n id\n url\n name\n goldmerchant: is_power_badge\n official: is_official\n reputation\n clover\n location\n __typename\n }\n labelGroups: label_groups {\n position\n title\n type\n __typename\n }\n __typename\n }\n __typename\n }\n displayAdsV3(displayParams: $adParams) {\n data {\n id\n ad_ref_key\n redirect\n sticker_id\n sticker_image\n productWishListUrl: product_wishlist_url\n clickTrackUrl: product_click_url\n shop_click_url\n product {\n id\n name\n wishlist\n image {\n imageUrl: s_ecs\n trackerImageUrl: s_url\n __typename\n }\n url: uri\n relative_uri\n price: price_format\n campaign {\n original_price\n discountPercentage: discount_percentage\n __typename\n }\n wholeSalePrice: wholesale_price {\n quantityMin: quantity_min_format\n quantityMax: quantity_max_format\n price: price_format\n __typename\n }\n count_talk_format\n countReview: count_review_format\n category {\n id\n __typename\n }\n preorder: product_preorder\n product_wholesale\n free_return\n isNewProduct: product_new_label\n cashback: product_cashback_rate\n rating: product_rating\n top_label\n bottomLabel: bottom_label\n 
__typename\n }\n shop {\n image_product {\n image_url\n __typename\n }\n id\n name\n domain\n location\n city\n tagline\n goldmerchant: gold_shop\n gold_shop_badge\n official: shop_is_official\n lucky_shop\n uri\n owner_id\n is_owner\n badges {\n title\n image_url\n show\n __typename\n }\n __typename\n }\n applinks\n __typename\n }\n template {\n isAd: is_ad\n __typename\n }\n __typename\n }\n}\n'}]
res = requests.post('https://gql.tokopedia.com/', headers=headers, data=payload)
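【Comments】:
-
The content-length really does look different.
-
Also, try "json=payload" instead of the "data=payload" argument. I'm not familiar with tokopedia or their API, though. As @Crapy said, the content-length looks odd. Maybe take it out.
-
No, that doesn't work. I sent a request from the original site using Chrome DevTools, and the response was correct.
-
But the question remains: how do I simulate the origin stated in the request headers?
-
耶稣米凯拉's method suddenly worked! Maybe it's because I stopped using my school's Wi-Fi, but it does work!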
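The two suggestions from the comments (send the payload as JSON and drop the hard-coded content-length) can be sketched as follows. This is a minimal illustration, not a confirmed fix: `clean_headers` and `json_body` are hypothetical helper names, and the payload is a shortened placeholder standing in for the full query from the question.

```python
import json


def clean_headers(headers):
    """Return a copy of `headers` without any content-length entry.

    A hard-coded content-length that does not match the real body size
    can make a server reject or misread the request.
    """
    return {k: v for k, v in headers.items() if k.lower() != "content-length"}


def json_body(payload):
    """Serialize `payload` to the UTF-8 bytes a JSON endpoint expects."""
    return json.dumps(payload).encode("utf-8")


# Placeholder payload (assumption): the real query text is much longer.
payload = [{
    "operationName": "SearchProductQuery",
    "variables": {"params": "&ob=23&page=1"},
    "query": "placeholder for the SearchProductQuery text",
}]

headers = {
    "content-type": "application/json",
    "origin": "https://www.tokopedia.com",
    "content-length": "25794",  # value from the question; removed below
}

cleaned = clean_headers(headers)
body = json_body(payload)

print(sorted(cleaned))
print(len(body))  # the actual body size in bytes for this payload
```

With the requests library, passing `json=payload` performs this serialization and sets Content-Length from the real body, so the call suggested in the comments would look like `requests.post('https://gql.tokopedia.com/', headers=cleaned, json=payload)`.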
Tags: python python-requests web-crawler fetch