【Question title】: How to web scrape and access <script> with bs4 using Python
【Posted】: 2018-07-16 20:47:44
【Question】:

I am trying to scrape data from "https://www.deadstock.ca/products/adidas-futurepacer-grey-one".

I would like to be able to read the variant data, which looks like this:

<script>
window.ShopifyAnalytics = window.ShopifyAnalytics || {};
window.ShopifyAnalytics.meta = window.ShopifyAnalytics.meta || {};
  window.ShopifyAnalytics.meta.currency = 'CAD';
  var meta = {"product":{"id":223724142613,"vendor":"Adidas","type":"Footwear 
- QS","variants":[{"id":3063231774741,"price":26000,"name":"adidas 
Futurepacer \/ Grey One - 8","public_title":"8","sku":"AQ0907-Grey One-8"}, 
{"id":3063231807509,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
8.5","public_title":"8.5","sku":"AQ0907-Grey One-8.5"}, 
{"id":3063231840277,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
9","public_title":"9","sku":"AQ0907-Grey One-9"}, 
{"id":3063231873045,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
9.5","public_title":"9.5","sku":"AQ0907-Grey One-9.5"}, 
{"id":3063231905813,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
10","public_title":"10","sku":"AQ0907-Grey One-10"}, 
{"id":3063231938581,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
10.5","public_title":"10.5","sku":"AQ0907-Grey One-10.5"}, 
{"id":3063231971349,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
11","public_title":"11","sku":"AQ0907-Grey One-11"}, 
{"id":3063232004117,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
12","public_title":"12","sku":"AQ0907-Grey One-12"}, 
{"id":3063232036885,"price":26000,"name":"adidas Futurepacer \/ Grey One - 
 13","public_title":"13","sku":"AQ0907-Grey One-13"}]},"page": 
{"pageType":"product","resourceType":"product","resourceId":223724142613}};
  for (var attr in meta) {
    window.ShopifyAnalytics.meta[attr] = meta[attr];
  }
</script>

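I don't think I am targeting the right element. I want the code to print every "id": number. This is my code so far; I am still new to bs4, so any help would be appreciated. Thanks.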

import bs4 as bs
import urllib.request
import lxml

link = urllib.request.urlopen ('https://www.deadstock.ca/products/adidas-futurepacer-grey-one').read()

soup = bs.BeautifulSoup(link,'lxml')

for variants in soup.find_all('script'):
    print (variants)

Building on @Andrej Kesely's answer, I tried something like this.

for id in data['variants']:
    size = id['variants']['option1']
    variantid = id['variants']['id']

print (size)
print (variantid)

It raises a KeyError, though; I just want it to display all the IDs.

【Question comments】:

    Tags: python web beautifulsoup scrape


    【Solution 1】:

    For this site, you need to target the element with id="ProductJson-product-template". It contains the JSON for all variants:

    import bs4 as bs
    import urllib.request
    import json
    
    link = urllib.request.urlopen('https://www.deadstock.ca/products/adidas-futurepacer-grey-one').read()
    
    soup = bs.BeautifulSoup(link,'lxml')
    
    variant = soup.find('script', id='ProductJson-product-template')
    data = json.loads(variant.text)
    
    print(json.dumps(data, indent=4, sort_keys=True))
    

    Prints:

    {
        "available": true,
        "compare_at_price": null,
        "compare_at_price_max": 0,
        "compare_at_price_min": 0,
        "compare_at_price_varies": false,
        "content": "<p>adidas digs into its vault to redesign the 1984 Micropacer, a computerized shoe that was designed to track running statistics. Today, the adidas Futurepacer takes cues from the Micropacer, and the more recent NMD collection, to deliver a futuristic lifestyle runner. Featuring a premium leather upper, complete with a modernized lace cover, the adidas Futurepacer sits on a Boost midsole for lightweight cushioning and responsiveness. NMD inspired bumpers are found at the heel and forefoot for added style and stability.</p>\n<ul>\n<li>Premium leather upper</li>\n<li>Premium leather lace cover</li>\n<li>Subtle adidas branding</li>\n<li>Boost midsole</li>\n<li>adidas midsole plugs</li>\n<li>Grey One</li>\n</ul>\n<p>Product Code: AQ0907</p>",
        "created_at": "2018-03-05T14:23:57-08:00",
        "description": "<p>adidas digs into its vault to redesign the 1984 Micropacer, a computerized shoe that was designed to track running statistics. Today, the adidas Futurepacer takes cues from the Micropacer, and the more recent NMD collection, to deliver a futuristic lifestyle runner. Featuring a premium leather upper, complete with a modernized lace cover, the adidas Futurepacer sits on a Boost midsole for lightweight cushioning and responsiveness. NMD inspired bumpers are found at the heel and forefoot for added style and stability.</p>\n<ul>\n<li>Premium leather upper</li>\n<li>Premium leather lace cover</li>\n<li>Subtle adidas branding</li>\n<li>Boost midsole</li>\n<li>adidas midsole plugs</li>\n<li>Grey One</li>\n</ul>\n<p>Product Code: AQ0907</p>",
        "featured_image": "//cdn.shopify.com/s/files/1/0616/3517/products/aq0907_adidas_futurepacer_grey_one.jpg?v=1530313621",
        "handle": "adidas-futurepacer-grey-one",
        "id": 223724142613,
        "images": [
            "//cdn.shopify.com/s/files/1/0616/3517/products/aq0907_adidas_futurepacer_grey_one.jpg?v=1530313621",
            "//cdn.shopify.com/s/files/1/0616/3517/products/aq0907_adidas_futurepacer_grey_one_1.jpg?v=1531247436",
            "//cdn.shopify.com/s/files/1/0616/3517/products/aq0907_adidas_futurepacer_grey_one_2.jpg?v=1531247441",
            "//cdn.shopify.com/s/files/1/0616/3517/products/aq0907_adidas_futurepacer_grey_one_3.jpg?v=1531247444",
            "//cdn.shopify.com/s/files/1/0616/3517/products/aq0907_adidas_futurepacer_grey_one_4.jpg?v=1531247447",
            "//cdn.shopify.com/s/files/1/0616/3517/products/aq0907_adidas_futurepacer_grey_one_5.jpg?v=1531247449"
        ],
        "options": [
            "US Size"
        ],
        "price": 26000,
        "price_max": 26000,
        "price_min": 26000,
        "price_varies": false,
        "published_at": "2018-07-14T12:00:00-07:00",
        "tags": [
            "07142018",
            "cf-type-footwear-qs",
            "cf-us-size-10",
            "cf-us-size-10-5",
            "cf-us-size-11",
            "cf-us-size-12",
            "cf-us-size-13",
            "cf-us-size-8",
            "cf-us-size-8-5",
            "cf-us-size-9",
            "cf-us-size-9-5",
            "cf-vendor-adidas",
            "free_shipping",
            "limit-quantity",
            "plsmerch"
        ],
        "title": "adidas Futurepacer / Grey One",
        "type": "Footwear - QS",
        "variants": [
            {
                "available": true,
                "barcode": "193050061142",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063231774741,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 3,
                "name": "adidas Futurepacer / Grey One - 8",
                "option1": "8",
                "option2": null,
                "option3": null,
                "options": [
                    "8"
                ],
                "price": 26000,
                "public_title": "8",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-8",
                "taxable": true,
                "title": "8",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061159",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063231807509,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 2,
                "name": "adidas Futurepacer / Grey One - 8.5",
                "option1": "8.5",
                "option2": null,
                "option3": null,
                "options": [
                    "8.5"
                ],
                "price": 26000,
                "public_title": "8.5",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-8.5",
                "taxable": true,
                "title": "8.5",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061166",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063231840277,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 6,
                "name": "adidas Futurepacer / Grey One - 9",
                "option1": "9",
                "option2": null,
                "option3": null,
                "options": [
                    "9"
                ],
                "price": 26000,
                "public_title": "9",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-9",
                "taxable": true,
                "title": "9",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061173",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063231873045,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 5,
                "name": "adidas Futurepacer / Grey One - 9.5",
                "option1": "9.5",
                "option2": null,
                "option3": null,
                "options": [
                    "9.5"
                ],
                "price": 26000,
                "public_title": "9.5",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-9.5",
                "taxable": true,
                "title": "9.5",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061180",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063231905813,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 6,
                "name": "adidas Futurepacer / Grey One - 10",
                "option1": "10",
                "option2": null,
                "option3": null,
                "options": [
                    "10"
                ],
                "price": 26000,
                "public_title": "10",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-10",
                "taxable": true,
                "title": "10",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061197",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063231938581,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 6,
                "name": "adidas Futurepacer / Grey One - 10.5",
                "option1": "10.5",
                "option2": null,
                "option3": null,
                "options": [
                    "10.5"
                ],
                "price": 26000,
                "public_title": "10.5",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-10.5",
                "taxable": true,
                "title": "10.5",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061203",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063231971349,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 1,
                "name": "adidas Futurepacer / Grey One - 11",
                "option1": "11",
                "option2": null,
                "option3": null,
                "options": [
                    "11"
                ],
                "price": 26000,
                "public_title": "11",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-11",
                "taxable": true,
                "title": "11",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061210",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063232004117,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 4,
                "name": "adidas Futurepacer / Grey One - 12",
                "option1": "12",
                "option2": null,
                "option3": null,
                "options": [
                    "12"
                ],
                "price": 26000,
                "public_title": "12",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-12",
                "taxable": true,
                "title": "12",
                "weight": 0
            },
            {
                "available": true,
                "barcode": "193050061227",
                "compare_at_price": null,
                "featured_image": null,
                "id": 3063232036885,
                "inventory_management": "shopify",
                "inventory_policy": "deny",
                "inventory_quantity": 1,
                "name": "adidas Futurepacer / Grey One - 13",
                "option1": "13",
                "option2": null,
                "option3": null,
                "options": [
                    "13"
                ],
                "price": 26000,
                "public_title": "13",
                "requires_shipping": true,
                "sku": "AQ0907-Grey One-13",
                "taxable": true,
                "title": "13",
                "weight": 0
            }
        ],
        "vendor": "Adidas"
    }
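If a page lacks the ProductJson-product-template element, the `var meta = {...}` assignment shown in the question could probably be parsed instead. The sketch below is an illustration only, not part of the original answer. It assumes the script text has already been obtained (for example from `script.string` while looping over `soup.find_all('script')`); `SAMPLE_SCRIPT` and `extract_meta` are hypothetical names, and the sample is a trimmed copy of the script from the question.

```python
import json
import re

# Trimmed stand-in for the text of the <script> tag quoted in the question.
SAMPLE_SCRIPT = """
window.ShopifyAnalytics = window.ShopifyAnalytics || {};
  var meta = {"product":{"id":223724142613,"vendor":"Adidas","variants":[{"id":3063231774741,"price":26000,"public_title":"8"},{"id":3063231807509,"price":26000,"public_title":"8.5"}]},"page":{"pageType":"product"}};
  for (var attr in meta) {
    window.ShopifyAnalytics.meta[attr] = meta[attr];
  }
"""


def extract_meta(script_text):
    """Return the object assigned to `var meta`, or None if it is absent."""
    match = re.search(r"var\s+meta\s*=\s*", script_text)
    if match is None:
        return None
    # raw_decode parses a single JSON value starting at the given index and
    # ignores whatever JavaScript follows it (the trailing ';' and the loop).
    meta, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return meta


meta = extract_meta(SAMPLE_SCRIPT)
variant_ids = [variant["id"] for variant in meta["product"]["variants"]]
print(variant_ids)
```

Note that in this script the variants sit under `meta["product"]["variants"]`, and they carry fewer fields (no `option1`, for instance) than the ProductJson data printed above.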
    

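    【Comments】:

    • Awesome. Now, from that code, how do I target only the ID in each variant?
    • @TNC data is just a regular Python dictionary, so you can work with it like any other dictionary.
    • I just tried something like this, but I'm not sure I'm on the right track. Post edited.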
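The KeyError in the edited question most likely comes from indexing `['variants']` a second time: `data['variants']` is a list, and each item in it is already a single variant dictionary. A minimal sketch of the loop follows, using a trimmed two-variant copy of the JSON printed above as a stand-in for the real `data` (the inline sample is an assumption made so the snippet runs without a network request).

```python
import json

# Trimmed copy of the printed JSON; in the real script this dictionary comes
# from json.loads(variant.text) in the answer's code.
data = json.loads("""
{
    "id": 223724142613,
    "title": "adidas Futurepacer / Grey One",
    "variants": [
        {"id": 3063231774741, "option1": "8"},
        {"id": 3063231807509, "option1": "8.5"}
    ]
}
""")

# Each element of data["variants"] is one variant dictionary, so its keys
# ("option1", "id", ...) are read from it directly.
for variant in data["variants"]:
    size = variant["option1"]
    variant_id = variant["id"]
    print(size, variant_id)

# The same IDs collected into a list.
variant_ids = [variant["id"] for variant in data["variants"]]
print(variant_ids)
```

Naming the loop variable `variant` rather than `id` also avoids shadowing Python's built-in `id()` function.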