python Webscrape requests vs selenium

【Question title】: python Webscrape requests vs selenium
【Posted】: 2021-04-03 10:03:30
【Question description】:

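I don't quite understand the response I'm getting here. It looks incomplete. I'm wondering whether this is the wrong approach and I should use Selenium instead. I'm trying to get all the menu items here, along with their prices and add-ons. For now I'm just inspecting the response. Any guidance would be very helpful. Thanks.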

import requests
from bs4 import BeautifulSoup


url = 'https://www.swiggy.com/dapi/menu/v4/full?lat=12.9715987&lng=77.5945627&menuId=41100'
page = requests.get(url).json()
for k, v in page.items():
    print(v)

【Question comments】:

At first glance this is not a requests vs. selenium issue. Please improve your question and provide your expected output.

Tags: python python-3.x selenium web-scraping python-requests

【Solution 1】:

Try this:

import requests

url = 'https://www.swiggy.com/dapi/menu/v4/full?lat=12.9715987&lng=77.5945627&menuId=41100'
page = requests.get(url).json()
menu_items = page["data"]["menu"]["items"]

for k, v in menu_items.items():
    print("Name: {}".format(v["name"]))
    print("Description: {}".format(v["description"]))
    print("Price: {}".format(v["price"]))
    if "addons" in v.keys():
        for i in v["addons"]:
            print("\t{}".format(i["group_name"]))
            for j in i["choices"]:
                print("\t\tName: {}".format(j["name"]))
                print("\t\tPrice: {}".format(j["price"]))
                print()
    print()

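【Comments】:

Thanks for the reply. Just one question: can I get the description of the dish? I think I saw it in the response. I'm just not very good with JSON.
By the way, the 'page' variable is just a nested dictionary.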
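
Because the response is a nested dictionary, it can help to print only its key structure before deciding which path to follow. The following is a minimal sketch; the helper name `outline` and the trimmed `sample_page` (including the item key `"8194821"`) are assumptions used as a stand-in for the real response.

```python
def outline(obj, depth=0, max_depth=3):
    """Return indented lines naming each key and the type of its value."""
    lines = []
    if isinstance(obj, dict) and depth < max_depth:
        for key, value in obj.items():
            lines.append("  " * depth + f"{key}: {type(value).__name__}")
            # Recurse into nested dicts until max_depth is reached.
            lines.extend(outline(value, depth + 1, max_depth))
    return lines


# Trimmed stand-in for the real response (assumed shape: data -> menu -> items).
sample_page = {
    "data": {
        "menu": {
            "items": {
                "8194821": {"name": "Egg Omelette", "description": "", "price": 8500},
            }
        }
    }
}

outline_lines = outline(sample_page)
print("\n".join(outline_lines))
```

With the real response, `outline(page)` would be called in the same way; raising `max_depth` shows deeper levels.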

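【Solution 2】:

What happens?

I don't think this is a requests vs. selenium web scraping issue, because all you are doing is requesting an API and getting a JSON response.

Your code works fine as written, but you expect it to do something else. The loop walks only the top-level keys of the response, so each v is a large nested value rather than a single menu item. You have to iterate over the dict in a different way.

This will do it: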

page['data']['menu']['items'].items()

Example

import requests

url = 'https://www.swiggy.com/dapi/menu/v4/full?lat=12.9715987&lng=77.5945627&menuId=41100'
page = requests.get(url).json()

for k, v in page['data']['menu']['items'].items():
    print(v)

Output

{'id': 8194821, 'name': 'Egg Omelette', 'category': 'Quick Bites', 'description': '', 'cloudinaryImageId': '', 'recommended': 0, 'inStock': 0, 'isVeg': 0, 'enabled': 1, 'displayOrder': 0, 'price': 8500, 'variants_new': {'exclude_list': [], 'variant_groups': []}, 'addons': [{'group_id': 21778839, 'group_name': 'Addons', 'choices': [{'id': 17471659, 'name': 'Watermelon Juice', 'price': 10000, 'inStock': 0, 'isVeg': 1, 'order': 1, 'default': 0}, {'id': 17471658, 'name': 'Coke (750 ml)', 'price': 6000, 'inStock': 1, 'isVeg': 1, 'order': 1, 'default': 0}], 'maxAddons': -1, 'maxFreeAddons': -1, 'minAddons': 0, 'order': 1}, {'group_id': 21778838, 'group_name': 'Beverage Addons', 'choices': [{'id': 17471658, 'name': 'Coke (750 ml)', 'price': 6000, 'inStock': 1, 'isVeg': 1, 'order': 1, 'default': 0}], 'maxAddons': -1, 'maxFreeAddons': -1, 'minAddons': 0, 'order': 1}], 'cropChoices': 2, 'itemScore': 0, 'itemDiscount': 0, 'isPopular': 0, 'restId': '41100', 'showMC': 0, 'ribbon': {'text': 'Must Try', 'textColor': '#ffffff', 'topBackgroundColor': '#d53d4c', 'bottomBackgroundColor': '#b02331'}, 'attributes': {'portionSize': '', 'vegClassifier': 'EGG', 'accompaniments': ''}, 'itemNudgeType': ''}
{'id': 8194822, 'name': 'Paneer Sholey', 'category': 'Starters', 'description': 'Fried cottage cheese cubes, tossed in signature spices & curry leaves garnish.', 'cloudinaryImageId': '', 'recommended': 0, 'inStock': 0, 'isVeg': 1, 'enabled': 1, 'displayOrder': 0, 'price': 22000, 'variants_new': {'exclude_list': [], 'variant_groups': []}, 'cropChoices': 2, 'itemScore': 0, 'itemDiscount': 0, 'isPopular': 0, 'restId': '41100', 'showMC': 0, 'attributes': {'portionSize': '', 'vegClassifier': 'VEG', 'accompaniments': ''}, 'itemNudgeType': ''}
...
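
Since each value printed above is itself a dict, the fields the question asks for (name, price, add-ons) can be collected into plain records instead of being printed. This is a minimal sketch that uses two trimmed entries copied from the output above in place of a live response; the function name `summarize_items` and the keys `"item_a"` and `"item_b"` are made up for illustration.

```python
def summarize_items(items):
    """Build one flat record per menu item from a dict of item dicts."""
    records = []
    for item in items.values():
        addon_names = [
            choice["name"]
            # Items without add-ons have no "addons" key, so default to [].
            for group in item.get("addons", [])
            for choice in group["choices"]
        ]
        records.append({
            "name": item["name"],
            "description": item["description"],
            "price": item["price"],
            "addons": addon_names,
        })
    return records


# Trimmed copies of the two items shown in the output; the keys are hypothetical.
sample_items = {
    "item_a": {
        "name": "Egg Omelette",
        "description": "",
        "price": 8500,
        "addons": [
            {
                "group_name": "Addons",
                "choices": [
                    {"name": "Watermelon Juice", "price": 10000},
                    {"name": "Coke (750 ml)", "price": 6000},
                ],
            }
        ],
    },
    "item_b": {
        "name": "Paneer Sholey",
        "description": "Fried cottage cheese cubes, tossed in signature spices & curry leaves garnish.",
        "price": 22000,
    },
}

records = summarize_items(sample_items)
for record in records:
    print(record)
```

With the live response, the call would be `summarize_items(page['data']['menu']['items'])`.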

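【Comments】:

Thanks for the reply. Could you take a look at the answer above? I'm looking for the description of the dish. I'm weak with JSON. Thanks a lot.
To get only the description: print(v.get('description')) instead of print(v)
It returns None... should we be using get?
This works: `print("Description: {}".format(v["description"]))`
You can use either; it depends on what you did before: for k, v in page['data']['menu']['items'].items(): print(v['description'])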
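
The difference between the two lookups only shows up when the key is absent: `v['description']` raises `KeyError`, while `v.get('description')` returns `None` (or a default you pass in). A small sketch of that behaviour follows; `egg_omelette` is trimmed from the output in Solution 2, and `no_description` is a hypothetical item invented to show the missing-key case.

```python
# Trimmed from the output above: the key exists but holds an empty string.
egg_omelette = {"name": "Egg Omelette", "description": "", "price": 8500}

# Hypothetical item with no "description" key at all.
no_description = {"name": "Plain Item", "price": 100}

present_but_empty = egg_omelette.get("description")   # key exists, value is ''
absent = no_description.get("description")            # key missing, so None
fallback = no_description.get("description", "n/a")   # key missing, default used

# Square-bracket access fails loudly when the key is missing.
try:
    no_description["description"]
    raised = False
except KeyError:
    raised = True

print(repr(present_but_empty), repr(absent), repr(fallback), raised)
```

Use `[]` when a missing key should be treated as an error, and `.get()` when a missing key is expected and a default is acceptable.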