【Posted】: 2021-12-30 14:24:43
【Question】:
I have a Python 3.8.5 script that fetches JSON from an API, saves it to disk, and then reads the JSON into a DataFrame. This works.
df = pd.io.json.read_json('json_file', orient='records')
I wanted to try an IO buffer so that I don't have to write to and read from disk, but I'm getting an error. The code looks like this:
import json
from io import StringIO
import pandas as pd
io = StringIO()
json_out = []
# some code to append API results to json_out
json.dump(json_out, io)
df = pd.io.json.read_json(io.getvalue())
On the last line I get this error:
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 199, in wrapper
return func(*args, **kwargs)
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 296, in wrapper
return func(*args, **kwargs)
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 618, in read_json
result = json_reader.read()
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 755, in read
obj = self._get_object_parser(self.data)
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 777, in _get_object_parser
obj = FrameParser(json, **kwargs).parse()
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 886, in parse
self._parse_no_numpy()
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 1119, in _parse_no_numpy
loads(json, precise_float=self.precise_float), dtype=None
ValueError: Trailing data
The JSON is a list of objects. This isn't the actual JSON, but when I write it to disk it looks like this:
json = [
{"state": "North Dakota",
"address": "123 30th st E #206",
"account": "123"
},
{"state": "North Dakota",
"address": "456 30th st E #206",
"account": "456"
}
]
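Given that it works in the first case (writing to and reading from disk), I don't know how to troubleshoot this. How do I fix the problem with the buffer? The real data is mostly text, but it has some numeric fields.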
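One thing worth checking, offered as a guess rather than a diagnosis: a single `json.dump` of a list into a fresh buffer should give one valid document, so the failure may come from the elided part of the code. "Trailing data" is what the parser tends to report when text continues after the first complete JSON value, which is what happens if `json.dump` runs more than once on the same buffer. The sketch below reproduces that situation with the standard library only; `batches` is a hypothetical stand-in for the API results, and the stdlib words the same problem as "Extra data".

```python
import json
from io import StringIO

# Hypothetical stand-in for two batches of API results.
batches = [
    [{"state": "North Dakota", "account": "123"}],
    [{"state": "North Dakota", "account": "456"}],
]

buf = StringIO()
for batch in batches:
    # Each call appends another complete JSON document to the same buffer.
    json.dump(batch, buf)

text = buf.getvalue()
print(text)  # two arrays back to back, with nothing separating them

error_message = None
try:
    json.loads(text)
except json.JSONDecodeError as exc:
    # The parser stops after the first array and rejects what follows.
    error_message = exc.msg
    print(error_message)
```

Printing `io.getvalue()` right before the `read_json` call should show whether the buffer holds a single array or several concatenated ones.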
【Discussion】:
Tags: json python-3.x pandas
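A minimal sketch of two ways to get the DataFrame without touching disk, assuming `json_out` ends up as a plain list of dicts like the sample in the question. The data here is hypothetical and the names `df_direct`, `buf`, and `df_from_buf` are made up for illustration. The first option skips JSON text entirely; the second serializes the finished list exactly once, rewinds the buffer, and hands the buffer itself to `pd.read_json`, which accepts file-like objects.

```python
import json
from io import StringIO

import pandas as pd

# Hypothetical API results, accumulated as a list of dicts.
json_out = [
    {"state": "North Dakota", "address": "123 30th st E #206", "account": "123"},
    {"state": "North Dakota", "address": "456 30th st E #206", "account": "456"},
]

# Option 1: the data is already a list of dicts, so build the frame directly.
df_direct = pd.DataFrame(json_out)

# Option 2: serialize the complete list once, then rewind before reading.
buf = StringIO()
json.dump(json_out, buf)
buf.seek(0)

# dtype=False asks pandas not to infer column dtypes from the values.
df_from_buf = pd.read_json(buf, orient="records", dtype=False)

print(df_direct)
print(df_from_buf)
```

If the results arrive in several batches, extending one list with `json_out.extend(batch)` and dumping it once at the end keeps the buffer a single JSON document.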