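The data is in the page, but embedded as a JavaScript array rather than as HTML. Because that array literal is also valid JSON, you can extract it with the re and json modules, for example: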
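import json
import re
import requests
url = 'https://232app.azurewebsites.net/Forms/ExclusionRequestItem/13700'
html_data = requests.get(url).text
# Capture the array literal assigned to arrValues inside createSourceCountriesTable().
# re.DOTALL lets '.' match newlines, since the function body spans several lines.
pattern = r'function createSourceCountriesTable\(\).*?var arrValues = (.*?);'
match = re.search(pattern, html_data, flags=re.DOTALL)
# The captured text is valid JSON, so json.loads turns it into a list of dicts.
json_data = json.loads(match.group(1))
print(json.dumps(json_data, indent=4))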
Prints:
[
    {
        "OriginCountry": "Spain",
        "ExportCountry": "Italy",
        "ExclusionQty": "20000",
        "Manufacturer": "Rodacciai",
        "Supplier": null
    },
    {
        "OriginCountry": "Spain",
        "ExportCountry": "Spain",
        "ExclusionQty": "3000",
        "Manufacturer": "Aceros Inoxidables Olarra",
        "Supplier": null
    },
    {
        "OriginCountry": "United Kingdom",
        "ExportCountry": "Italy",
        "ExclusionQty": "3000",
        "Manufacturer": "Rodacciai",
        "Supplier": null
    }
]
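If you want to reuse the extraction, or try it without hitting the site, the same regex can be wrapped in a small function. The following is only a sketch under some assumptions: the name extract_source_countries, the SAMPLE_HTML string (a shortened, hypothetical stand-in for the real page source, including the made-up buildTable call) and the choice to return an empty list when nothing matches are all illustrative and not part of the original page or answer.

```python
import json
import re

# Same pattern as in the answer: capture the array literal assigned to
# arrValues inside createSourceCountriesTable().
PATTERN = re.compile(
    r"function createSourceCountriesTable\(\).*?var arrValues = (.*?);",
    flags=re.DOTALL,
)


def extract_source_countries(html):
    """Return the arrValues array as a list of dicts, or [] if it is absent."""
    match = PATTERN.search(html)
    if match is None:
        return []
    return json.loads(match.group(1))


# Hypothetical, shortened stand-in for the real page source (assumption).
SAMPLE_HTML = """
<script>
function createSourceCountriesTable() {
    var arrValues = [{"OriginCountry": "Spain", "ExportCountry": "Italy", "ExclusionQty": "20000", "Manufacturer": "Rodacciai", "Supplier": null}];
    buildTable(arrValues);
}
</script>
"""

rows = extract_source_countries(SAMPLE_HTML)
# ExclusionQty arrives as a string, so convert it before doing arithmetic.
total_qty = sum(int(row["ExclusionQty"]) for row in rows)
print(rows[0]["Manufacturer"], total_qty)
```

Note that the lazy (.*?); group stops at the first semicolon it meets, so this approach would truncate the array if any value in it contained a ";". For the data shown above that is not a problem.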