【Question title】: ValueError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
【Posted】: 2017-05-19 17:38:27
【Question】:

I wrote a Python crawler that fetches JSON from a specific site.

I am trying to clean up the fetched text so that I can extract the data and save it to a database.

Extraction code:

s = page_ad.findAll('script')[27].text.replace('\'', '"')
s = re.search(r'\{.+\}', s, re.DOTALL).group() # get json data
s = re.sub(r'//.+\n', '', s) # replace comment
s = re.sub(r'\s+', '', s) # strip whitespace
s = re.sub(r',}', '}', s) # get rid of last , in the dict

Result after running this code:

{varsource="".toLowerCase();if(mobileSources.indexOf(source)!=-1){returntrue;}returnfalse;}functiongetSource(){varmsiteSources=["mobile","msite"];varuserAgent=navigator.userAgent.toLowerCase();varsource="".toLowerCase();if(mobileSources.indexOf(source)!=-1){if(msiteSources.indexOf(source)!=-1){source="msite";varresultMatch=userAgent.match(/\olx-source\/(\w+);/);if(resultMatch){source=resultMatch[1];}}}else{source="web";}returnsource;}dataLayer=function(){varinitialDatalayer={"config":{"lurkerURL":"},"site":{"isMobile":isMobile(),"source":getSource()},"page":{"pageType":"ad_detail","detail":{"parent_category_id":"2000","category_id":"2020","state_id":"2","region_id":"31","ad_id":"382568903","list_id":"314710679","city_id":"9238","zipcode":"32606174","price":"19900"},"adDetail":{"adID":"382568903","listID":"314710679","sellerName":"MichelleAlcântara","adDate":"2017-03-1113:10:55","mainCategory":"Veículosebarcos","mainCategoryID":"2000","subCategory":"Carros","subCategoryID":"2020","state":"MG","ddd":"31","region":"BeloHorizonteeregião","price":"19900"}},"session":{"user":{"userID":null,"loginType":null}},"pageType":"Ad_detail","abtestingEnable":"1","listingCategory":"2020","adId":"382568903","state":"2","region":"31","category":"2020","pictures":"5","listId":"314710679","loggedUser":"0","referrer":""};if(self.adParams){for(keyinadParams){varpage=initialDatalayer.page;page.detail[key]=adParams[key];if(page.adDetail){page.adDetail[key]=adParams[key];}}}return[initialDatalayer];}

But when I try to parse it as JSON, I get the error below.

JSON conversion:

dataLayer = json.loads(s)

Error message:

Traceback (most recent call last):
  File "libs/olx/crawler_ads_information.py", line 100, in <module>
    run(link_base)
  File "libs/olx/crawler_ads_information.py", line 38, in run
    information = getVehicleInformation(page_ad)
  File "libs/olx/crawler_ads_information.py", line 49, in getVehicleInformation
    dataLayer = json.loads(s)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

【Comments】:

  • What do you see when you print s?

Tags: python json


【Answer 1】:

JSON is a serialized data format, not arbitrary JavaScript code.

This is valid JSON:

{"key": 123, "key2": "value2_string"}

It can be "converted" into a Python dict.
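
As a minimal sketch of that conversion, the snippet below parses a small JSON document with the standard json module. The keys and values are illustrative placeholders, not data from the question.

```python
import json

# A small, valid JSON document: keys are double-quoted strings,
# values are JSON values (here a number and a string).
text = '{"key": 123, "key2": "value2_string"}'

# json.loads turns the JSON text into a Python dict.
data = json.loads(text)

print(type(data).__name__)
print(data["key2"])
```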

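The string you are passing to loads is just JavaScript code. It starts with {varsource=..., so the parser finds no double-quoted property name right after the opening brace and fails at char 1.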
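
The failure can be reproduced with a much shorter string. The sketch below feeds json.loads a fragment that, like the output in the question, starts with an opening brace followed by JavaScript rather than a quoted key. The fragment itself is a shortened stand-in, not the full page script.

```python
import json

# Shortened stand-in for the scraped text: a brace followed by
# JavaScript statements, not by a double-quoted property name.
snippet = '{varsource="".toLowerCase();}'

try:
    json.loads(snippet)
    error_message = None
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError in Python 3
    error_message = str(exc)

print(error_message)
```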

You can find more information about JSON here: http://json.org/
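
One possible way forward, sketched here under assumptions rather than taken from this answer, is to decode only a part of the script that really is JSON. In the output shown in the question, the object after "adDetail": contains only quoted keys and string values, whereas the enclosing object also holds calls such as isMobile() that JSON cannot represent. The helper name extract_json_object and the sample string are hypothetical. json.JSONDecoder.raw_decode is a standard library method that parses one JSON value starting at a given index and ignores whatever follows it.

```python
import json


def extract_json_object(js_source, key):
    """Decode the JSON object that follows '"key":' in js_source.

    Hypothetical helper: assumes the value after the key is an object
    literal that is also valid JSON.
    """
    marker = '"%s":' % key
    start = js_source.find(marker)
    if start == -1:
        raise ValueError("key %r not found" % key)
    brace = js_source.find("{", start + len(marker))
    if brace == -1:
        raise ValueError("no object literal after key %r" % key)
    # raw_decode parses a single JSON value beginning at index `brace`
    # and returns it together with the index where parsing stopped.
    obj, _end = json.JSONDecoder().raw_decode(js_source, brace)
    return obj


# Shortened sample modelled on the script text in the question.
sample = (
    'dataLayer=function(){var initialDatalayer={"site":{"isMobile":isMobile()},'
    '"page":{"adDetail":{"adID":"382568903","price":"19900","state":"MG"}}};'
    'return [initialDatalayer];}'
)

ad_detail = extract_json_object(sample, "adDetail")
print(ad_detail["price"])
```

Applying this to the raw script text, before the blanket quote replacement and whitespace stripping, avoids corrupting string values that contain spaces, apostrophes, or URLs with //.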
