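[Posted]: 2022-10-04 16:24:03
[Question]:
After visiting this website, when I fill in the input box (City or zip) with Miami, FL and click the search button, I can see the relevant results displayed on the site.
I would like to mimic the same thing using the requests module. I tried to follow the steps shown in dev tools, but for some reason the script below produces this output: You are not authorized to access this request.
What I've tried: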
import json
import requests
from pprint import pprint
from bs4 import BeautifulSoup

URL = "https://www.realtor.com/realestateagents/"
link = 'https://www.realtor.com/realestateagents/api/v3/search'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36',
    'Accept': 'application/json, text/plain, */*',
    'referer': 'https://www.realtor.com/realestateagents/',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9,bn;q=0.8',
    'X-Requested-With': 'XMLHttpRequest',
    'x-newrelic-id': 'VwEPVF5XGwQHXFNTBAcAUQ==',
    # Bearer token copied by hand from the browser's dev tools
    'authorization': 'Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE2NjQ1MjU0NDQsInN1YiI6ImZpbmRfYV9yZWFsdG9yIiwiaWF0IjoxNjY0NTI0Nzk2fQ.Q2jryTAD5vgsJ37e1SylBnkaeK7Cln930Q8KL4ANqsM'
}

# Query string as captured from the search request in dev tools
params = {
    'nar_only': '1',
    'offset': '',
    'limit': '20',
    'marketing_area_cities': 'FL_Miami',
    'postal_code': '',
    'is_postal_search': 'true',
    'name': '',
    'types': 'agent',
    'sort': 'recent_activity_high',
    'far_opt_out': 'false',
    'client_id': 'FAR2.0',
    'recommendations_count_min': '',
    'agent_rating_min': '',
    'languages': '',
    'agent_type': '',
    'price_min': '',
    'price_max': '',
    'designations': '',
    'photo': 'true',
    'seoUserType': "{'isBot':'false','deviceType':'desktop'}",
    'is_county_search': 'false',
    'county': ''
}

with requests.Session() as s:
    s.headers.update(headers)
    res = s.get(link, params=params)
    print(res.status_code)
    print(res.json())
-
Please check whether your access token has expired.
-
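When I created this post a few days ago, I collected the authorization token directly from dev tools. I have not found any way to refresh the token automatically, and it is hard to tell whether the token is still valid. When I run the script, I get status 200.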
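One way to tell whether a copied token is still valid is to read its expiry claim. The bearer value in the script has the three-part shape of a JWT, so the sketch below decodes the middle (payload) segment without verifying the signature. It assumes the server actually enforces the `exp` claim; the helper names `jwt_payload` and `is_expired` are made up for illustration.

```python
import base64
import json
import time

# Token copied verbatim from the script in the question.
TOKEN = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
    ".eyJleHAiOjE2NjQ1MjU0NDQsInN1YiI6ImZpbmRfYV9yZWFsdG9yIiwiaWF0IjoxNjY0NTI0Nzk2fQ"
    ".Q2jryTAD5vgsJ37e1SylBnkaeK7Cln930Q8KL4ANqsM"
)


def jwt_payload(token):
    """Return the decoded payload of a JWT (signature is NOT verified)."""
    payload_b64 = token.split(".")[1]
    # JWT segments drop base64 padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_expired(token, now=None):
    """True if the token's `exp` claim is at or before `now` (Unix seconds)."""
    now = time.time() if now is None else now
    return jwt_payload(token)["exp"] <= now


if __name__ == "__main__":
    claims = jwt_payload(TOKEN)
    print(claims)
    print("lifetime in seconds:", claims["exp"] - claims["iat"])
    print("expired:", is_expired(TOKEN))
```

For the token in the question, the payload carries `exp` 1664525444 and `iat` 1664524796, a lifetime of 648 seconds, so a token copied from dev tools goes stale roughly eleven minutes after it was issued.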
-
I think you should collect the authorization token again and then retry.
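Since the script reports status 200 even when access is refused, a retry loop cannot rely on the status code alone. The sketch below wraps the search call so that a refusal is surfaced as an exception, which is the cue to copy a fresh token and try again. `search_agents`, `TokenRejected` and `DENIED_MARKER` are hypothetical names, and matching on the message text is an assumption based only on the output quoted in the question.

```python
SEARCH_URL = "https://www.realtor.com/realestateagents/api/v3/search"
# Assumption: a refusal contains the message quoted in the question.
DENIED_MARKER = "not authorized"


class TokenRejected(Exception):
    """Raised when the response looks like an authorization refusal."""


def search_agents(session, token, params):
    """Send the search request with `token`; raise TokenRejected on refusal.

    `session` is any object with a requests-style `get` method,
    for example a `requests.Session()`.
    """
    response = session.get(
        SEARCH_URL,
        params=params,
        headers={"authorization": f"Bearer {token}"},
        timeout=30,
    )
    refused = (
        response.status_code in (401, 403)
        or DENIED_MARKER in response.text.lower()
    )
    if refused:
        raise TokenRejected("token rejected; collect a fresh one and retry")
    return response.json()
```

With the original script this would be called as `search_agents(s, fresh_token, params)` inside the `with requests.Session() as s:` block, where `fresh_token` is the newly copied value without the `Bearer ` prefix.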
Tags: python python-3.x web-scraping python-requests