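【Posted】: 2020-12-07 05:56:12
【Question】:
I am trying to read the data inside this script tag, but I cannot get it to load as JSON so that I can parse it afterwards. The fields I am interested in are name, image, sku, and price.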
HTML:
<script type="application/ld+json">
{
"@context": "http://schema.org/",
"@type": "Product",
"name": "Key Pouch",
"image": "https://us.louisvuitton.com/images/is/image/lv/1/PP_VP_L/louis-vuitton-key-pouch-monogram-gifts-for-women--M62650_PM2_Front view.jpg",
"description": "The Key Pouch in iconic Monogram canvas is a playful yet practical accessory that can carry coins, cards, folded bills and other small items, in addition to keys. Secured with an LV-engraved zip, it can be hooked onto the D-ring inside most Louis Vuitton bags, or used as a bag or belt charm.",
"sku": "M62650",
"brand": {
"@type": "Thing",
"name": "LOUIS VUITTON"
},
"offers": {
"@type": "Offer",
"url" : "https://us.louisvuitton.com/eng-us/products/key-pouch-monogram-000941",
"priceCurrency": "USD",
"price": "215.00",
"availability": "http://schema.org/OutOfStock",
"seller": {
"@type": "Organization",
"name": "LOUIS VUITTON"
}
}
}
</script>
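For reference, once the JSON-LD text is available as a string, the four fields of interest can be read with the standard `json` module. The sketch below assumes the structure shown above, where `price` sits inside the nested `offers` object rather than at the top level. The helper name `pick_product_fields` and the shortened `SAMPLE_LD_JSON` string are illustrative and not part of the original post.

```python
import json

# Shortened copy of the JSON-LD payload shown above (illustrative sample).
SAMPLE_LD_JSON = """
{
  "@type": "Product",
  "name": "Key Pouch",
  "image": "https://us.louisvuitton.com/images/is/image/lv/1/PP_VP_L/louis-vuitton-key-pouch-monogram-gifts-for-women--M62650_PM2_Front view.jpg",
  "sku": "M62650",
  "offers": {"@type": "Offer", "priceCurrency": "USD", "price": "215.00"}
}
"""


def pick_product_fields(ld_json_text):
    """Return name, image, sku and price from a schema.org Product JSON-LD string."""
    data = json.loads(ld_json_text)
    # "price" lives inside the nested "offers" object, not at the top level.
    offers = data.get("offers") or {}
    return {
        "name": data.get("name"),
        "image": data.get("image"),
        "sku": data.get("sku"),
        "price": offers.get("price"),
    }


if __name__ == "__main__":
    print(pick_product_fields(SAMPLE_LD_JSON))
```

Note that `price` is kept as the string found in the markup; converting it to a number is left to the caller.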
Code:
from urllib.request import Request, urlopen
import json
from bs4 import BeautifulSoup as soup
# Send a browser-like User-Agent header with the request
HEADERS = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko)'}
req = Request("https://us.louisvuitton.com/eng-us/products/key-pouch-monogram-000941", headers=HEADERS)
webpage = urlopen(req).read()
page_soup = soup(webpage, "html.parser")
# Find the JSON-LD script tag and decode its contents
data = json.loads(page_soup.find('script', type='application/ld+json').text)
print(data)
Output:
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Any help would be greatly appreciated.
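One possible cause worth checking, offered as an assumption rather than a confirmed diagnosis: `Expecting value: line 1 column 1 (char 0)` is the error `json.loads` raises when it is handed an empty string. Some BeautifulSoup releases do not count the contents of a `<script>` tag as text, so `.text` can come back empty while `.string` still holds the payload. The server may also have returned a page that does not contain the JSON-LD block at all. A minimal sketch that reads `.string` and fails with a clearer message in either case follows; the helper name `load_ld_json` and the sample HTML are hypothetical.

```python
import json
from bs4 import BeautifulSoup


def load_ld_json(html):
    """Parse the first <script type="application/ld+json"> block in html."""
    page_soup = BeautifulSoup(html, "html.parser")
    tag = page_soup.find("script", type="application/ld+json")
    if tag is None:
        raise ValueError("no application/ld+json script tag in the response")
    # .string returns the tag's single child string, including script content.
    raw = (tag.string or "").strip()
    if not raw:
        raise ValueError("the ld+json script tag is empty")
    return json.loads(raw)


if __name__ == "__main__":
    # Hypothetical stand-in for the downloaded page.
    sample_html = (
        '<html><head><script type="application/ld+json">'
        '{"name": "Key Pouch", "sku": "M62650", "offers": {"price": "215.00"}}'
        "</script></head></html>"
    )
    data = load_ld_json(sample_html)
    print(data["name"], data["sku"], data["offers"]["price"])
```

If the `ValueError` about a missing tag is raised for the real page, printing the first few hundred characters of the downloaded HTML should show what the server actually sent.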
【Comments】:
-
Is your script named "json.py"?
-
I have edited the output and the file name, thanks!
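The file-name question matters because a local file called `json.py` would be imported in place of the standard library module. The traceback in the question already shows `JSONDecodeError` being raised, which suggests the real module is in use, but a quick check can rule shadowing out. This is a sketch only, and the helper name `describe_json_module` is hypothetical.

```python
import json


def describe_json_module():
    """Report where the imported json module came from and what it provides."""
    return {
        # A shadowing file would show a path inside the project directory.
        "file": getattr(json, "__file__", None),
        "has_loads": hasattr(json, "loads"),
        "has_decode_error": hasattr(json, "JSONDecodeError"),
    }


if __name__ == "__main__":
    print(describe_json_module())
```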
Tags: python html json beautifulsoup python-requests