【Question title】:Python unable to parse JSON file with error "raise JSONDecodeError("Extra data", s, end) json.decoder.JSONDecodeError: Extra data"
【Posted】:2021-02-12 23:29:24
【Question】:

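I am trying to download a JSON file from an API and convert it to a CSV file, but the script throws the error below while parsing the JSON file.

After every 100 records, the JSON file closes the array with "]" and opens a new one with "[". The result is not valid JSON. Could you suggest an efficient way to parse a file in which "]" and "[" appear after every 100 records? The code works when there are fewer than 100 records, because the file then contains no additional "]" "[" brackets.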

Error message:

raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data
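
For reference, the same error can be reproduced with a much smaller input. The sketch below is an illustration only: the `payload` string is made up, not taken from the API response. It shows that `json.loads` parses the first array successfully and then complains about whatever follows it.

```python
import json

# Two JSON arrays back to back, mimicking the file layout described above.
# This payload is a made-up miniature, not real API output.
payload = '[{"A": "5"}][{"A": "6"}]'

error = None
try:
    json.loads(payload)
except json.JSONDecodeError as exc:
    # exc.msg is the short reason, exc.pos is the offset at which the
    # unparsed "extra" text begins.
    error = exc
    print(exc.msg, "at offset", exc.pos)
    print("unparsed remainder:", payload[exc.pos:])
```

The offset reported in the exception points at the second "[", which is where a valid single JSON document would have had to end.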
 

JSON file format:

**[**
    {
        "A": "5",
        "B": "811",
        "C": [
            {   "C1": 1,
                "C2": "sa",
                "C3": 3
                
            }
        ],
        "D": "HH",
        "E": 0,
        "F": 6
    },
    {
        "A": "5",
        "B": "811",
        "C": [
            {   "C1": 1,
                "C2": "fa",
                "C3": 3
                
            }
        ],
        "D": "HH",
        "E": 0,
        "F": 6
    }
    **]**
    **[**
    {
        "A": "5",
        "B": "811",
        "C": [
            {   "C1": 1,
                "C2": "da",
                "C3": 3
                
            }
        ],
        "D": "HH",
        "E": 0,
        "F": 6
    }
    **]**
     

Code:

import json
import pandas as pd
from flatten_json import flatten

def json2excel():
    file_path = r"<local file path>"
    json_list = json.load(open(file_path + '.json', 'r', encoding='utf-8', errors='ignore'))
    key_list = ['A', 'B']
    json_list = [{k: d[k] for k in key_list} for d in json_list]
    # Flatten and convert to a data frame
    json_list_flattened = (flatten(d, '.') for d in json_list)
    df = pd.DataFrame(json_list_flattened)
    # Export to CSV in the same directory with the original file name
    export_csv = df.to_csv(file_path + r'.csv', sep=',', encoding='utf-8', index=None, header=True)

def main():
    json2excel()

【Comments】:

    Tags: python json python-3.x


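    【Solution 1】:

    I suggest preprocessing the data you receive from the API first. The preprocessed data can then be fed to the JSON parser.

    I came up with a simple piece of Python code that is a small tweak of the usual solution to the bracket-matching problem. Here is my working code, which you can use to preprocess your data.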

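    def build_json_items(custom_json):
        """Split back-to-back JSON documents into balanced chunks.

        Returns (is_balanced, json_items). Brackets that appear inside
        string values are not treated specially.
        """
        open_tup = tuple('({[')
        close_tup = tuple(')}]')
        closing_for = dict(zip(open_tup, close_tup))  # avoids shadowing built-in map
        stack = []  # closing brackets still expected, innermost last

        json_items = []
        temp = ""
        for ch in custom_json:
            if ch in open_tup:
                stack.append(closing_for[ch])
            elif ch in close_tup:
                if not stack or ch != stack.pop():
                    # Stray or mismatched closing bracket; keep the return
                    # type consistent so callers can always unpack two values
                    return False, json_items

            temp += ch
            if not stack:
                # Everything so far is balanced, so one complete item ends here.
                # Whitespace between items is dropped instead of stored.
                if temp.strip():
                    json_items.append(temp)
                temp = ""  # Re-initialize

        # The input is balanced only if no bracket is left open
        return not stack, json_items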
    

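    The build_json_items function takes your custom JSON payload and splits it into the individual valid JSON items, based on the format you described in the question. Here is an example of how to call it.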

    input_data = "[{\"A\":\"5\",\"B\":\"811\",\"C\":[{\"C1\":1,\"C2\":\"sa\",\"C3\":3}],\"D\":\"HH\",\"E\":0,\"F\":6},{\"A\":\"5\",\"B\":\"811\",\"C\":[{\"C1\":1,\"C2\":\"fa\",\"C3\":3}],\"D\":\"HH\",\"E\":0,\"F\":6}][{\"A\":\"5\",\"B\":\"811\",\"C\":[{\"C1\":1,\"C2\":\"da\",\"C3\":3}],\"D\":\"HH\",\"E\":0,\"F\":6}]"
    
    is_balanced, json_items = build_json_items(input_data)
    print(f"Available JSON items: {len(json_items)}")
    print("JSON items are the following")
    for i in json_items: 
        print(i)
    

    This is the output of the print statements.

    Available JSON items: 2
    JSON items are the following
    [{"A":"5","B":"811","C":[{"C1":1,"C2":"sa","C3":3}],"D":"HH","E":0,"F":6},{"A":"5","B":"811","C":[{"C1":1,"C2":"fa","C3":3}],"D":"HH","E":0,"F":6}]
    [{"A":"5","B":"811","C":[{"C1":1,"C2":"da","C3":3}],"D":"HH","E":0,"F":6}]
    

    You can run it directly and see the output here.

    Once the payload has been split into valid JSON structures, you can feed each of them to your JSON parser.
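
As an alternative to hand-written bracket matching, the standard library's `json.JSONDecoder.raw_decode` can parse one document at a time and report where it stopped, which also copes with brackets inside string values. The following is a minimal sketch, not the answer author's method. The helper names `iter_json_documents` and `load_concatenated_arrays` and the `sample` string are hypothetical, and the sketch assumes that every top-level value in the file is an array and that the `**` markers shown in the question are emphasis only, not part of the file.

```python
import json


def iter_json_documents(text):
    """Yield each top-level JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here
        if text[pos].isspace():
            pos += 1
            continue
        # Returns the parsed value and the index just past its last character
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def load_concatenated_arrays(text):
    """Merge all top-level arrays into one flat list of records."""
    records = []
    for chunk in iter_json_documents(text):
        # Assumption: each chunk is a list of record dicts
        records.extend(chunk)
    return records


# Made-up sample with the same shape as the file in the question
sample = '[{"A": "5", "B": "811"}, {"A": "5", "B": "812"}]\n[{"A": "6", "B": "813"}]\n'
records = load_concatenated_arrays(sample)
print(len(records), "records")
```

The merged `records` list has the same shape that `json.load` would return for a single well-formed array, so it could be passed to the key filtering and flattening steps from the question in place of `json_list`.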

    【Discussion】:
