【Posted】: 2021-05-09 14:05:22
【Question】:
I'm having a problem with the BigQuery Python API. This is the stack trace I get when I run my script:
Traceback (most recent call last):
  File "createTable.py", line 17, in <module>
    open_schema()
  File "createTable.py", line 12, in open_schema
    table = bigquery.Table(table_id, schema=schema)
  ...
    "Schema items must either be fields or compatible "
ValueError: Schema items must either be fields or compatible mapping representations.
The script is simple: it opens a schema file and creates the table:
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

table_id = "project-py-290522:bq_dts.bq-test"

def open_schema():
    with open("hcl-schema.json","r", encoding = "utf-8") as fName:
        schema = fName.readlines()

    table = bigquery.Table(table_id, schema=schema)
    print(repr(table))
    client.create_table(table) # Make an API request.

if __name__ == "__main__":
    open_schema()
    print("Created table {}.{}.{}".format(table.project, table.dataset_id, table.table_id))
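When I submit the same schema through the console or the CLI, the table is created exactly as expected. Why can the console and the CLI create the table while the API chokes on it? I have searched and searched and found no answer. Can anyone help?
Here is the schema stored in the hcl-schema.json file. I shortened the list of attributes for brevity, but it is otherwise unchanged: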
[
  {
    "name":"user_id",
    "type":"STRING",
    "mode":"NULLABLE"
  },
  {
    "name":"msg_version",
    "type":"STRING",
    "mode":"REQUIRED"
  },
  {
    "name":"APIStreamData",
    "type":"RECORD",
    "mode":"REQUIRED",
    "fields":
    [
      {
        "name":"msg_version",
        "type":"STRING",
        "mode":"REQUIRED"
      },
      {
        "name":"streams",
        "type":"RECORD",
        "mode":"REPEATED",
        "fields":
        [
          {
            "name":"length",
            "type":"STRING",
            "mode":"REQUIRED"
          },
          {
            "name":"cached",
            "type":"STRING",
            "mode":"NULLABLE"
          },
          {
            "name":"track",
            "type":"RECORD",
            "mode":"REQUIRED",
            "fields":
            [
              {
                "name":"msg_version",
                "type":"STRING",
                "mode":"REQUIRED"
              },
              {
                "name":"track_id",
                "type":"STRING",
                "mode":"REQUIRED"
              }
            ]
          }
        ]
      }
    ]
  }
]
Thanks,
Dazed and confused
【Comments】:
- It might be a file-encoding problem... try printing the schema variable just before table = bigquery.Table(table_id, schema=schema)
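Following up on the suggestion to print schema: a likely cause is that readlines() returns a list of raw text lines (strings), while the error message asks for fields or mapping representations such as dicts. The sketch below compares the two ways of reading the file, using a two-field excerpt of the schema above. The helper names (read_schema_as_lines, load_schema, create_table_from_schema_file) are hypothetical, and the last helper assumes google-cloud-bigquery is installed and that client is a bigquery.Client; it is only a sketch of one possible fix, not a verified one for this project.

```python
import json
import os
import tempfile


def read_schema_as_lines(path):
    """What the script in the question does: a list of str, one per text line."""
    with open(path, "r", encoding="utf-8") as f:
        return f.readlines()


def load_schema(path):
    """Parse the file as JSON: a list of dicts, one per top-level field."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def create_table_from_schema_file(client, table_id, path):
    """Hypothetical helper: build the table from the parsed schema.

    Assumes google-cloud-bigquery is installed and `client` is a
    bigquery.Client. The import is local so the helpers above can be
    tried without the library.
    """
    from google.cloud import bigquery

    table = bigquery.Table(table_id, schema=load_schema(path))
    return client.create_table(table)  # Makes an API request.


# Demonstration on a two-field excerpt of the schema from the question.
excerpt = [
    {"name": "user_id", "type": "STRING", "mode": "NULLABLE"},
    {"name": "msg_version", "type": "STRING", "mode": "REQUIRED"},
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "hcl-schema.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(excerpt, f, indent=2)

    as_lines = read_schema_as_lines(demo_path)
    as_json = load_schema(demo_path)

print(type(as_lines[0]).__name__)  # str: a raw line of text
print(type(as_json[0]).__name__)   # dict: a mapping the Table constructor can convert
```

If this is indeed the cause, replacing fName.readlines() with json.load(fName) in the original script should be enough. The client also offers Client.schema_from_json(path), which reads such a file and returns schema fields directly. Separately, the Python client's examples write table IDs as project.dataset.table, so the colon in table_id may need to become a dot; that part is an assumption and untested here.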
Tags: python json google-bigquery