【Question Title】:Data Transfers to Big Query error "Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1"
【Posted】:2021-07-21 15:03:16
【Question】:

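Hello, I am currently trying to set up a daily data transfer from Google Cloud Storage to BigQuery tables. These tables only store raw data (JSON files), which I later unnest with scheduled queries. I have no problem when I create a table manually, but when I launch a test transfer I get the error Error while reading data, error message: JSON table encountered too many errors, giving up. Rows: 1; errors: 1 (even if I delete the data in the table beforehand).

Here is an example of the raw data table / JSON schema: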

{
  "fields": [{
    "name": "data",
    "type": "RECORD",
    "mode": "REPEATED",
    "schema": {
      "fields": [{
        "name": "relationships",
        "type": "RECORD",
        "mode": "NULLABLE",
        "schema": {
          "fields": [{
            "name": "createdBy",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "validationWorkflow",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "projects",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "expensesReport",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "agency",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "files",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "STRING",
                "mode": "REPEATED"
              }]
            }
          }, {
            "name": "resource",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "validations",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }, {
            "name": "orders",
            "type": "RECORD",
            "mode": "NULLABLE",
            "schema": {
              "fields": [{
                "name": "data",
                "type": "RECORD",
                "mode": "REPEATED",
                "schema": {
                  "fields": [{
                    "name": "type",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }]
            }
          }]
        }
      }, {
        "name": "attributes",
        "type": "RECORD",
        "mode": "NULLABLE",
        "schema": {
          "fields": [{
            "name": "plannedTimes",
            "type": "STRING",
            "mode": "REPEATED"
          }, {
            "name": "state",
            "type": "STRING",
            "mode": "NULLABLE"
          }, {
            "name": "exceptionalTimes",
            "type": "STRING",
            "mode": "REPEATED"
          }, {
            "name": "closed",
            "type": "BOOLEAN",
            "mode": "NULLABLE"
          }, {
            "name": "informationComments",
            "type": "STRING",
            "mode": "NULLABLE"
          }, {
            "name": "regularTimes",
            "type": "RECORD",
            "mode": "REPEATED",
            "schema": {
              "fields": [{
                "name": "batch",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "title",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "workUnitType",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "name",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "activityType",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "reference",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "project",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "reference",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "row",
                "type": "INTEGER",
                "mode": "NULLABLE"
              }, {
                "name": "delivery",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "endDate",
                    "type": "DATE",
                    "mode": "NULLABLE"
                  }, {
                    "name": "startDate",
                    "type": "DATE",
                    "mode": "NULLABLE"
                  }, {
                    "name": "title",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "id",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "startDate",
                "type": "DATE",
                "mode": "NULLABLE"
              }, {
                "name": "duration",
                "type": "FLOAT",
                "mode": "NULLABLE"
              }, {
                "name": "id",
                "type": "INTEGER",
                "mode": "NULLABLE"
              }]
            }
          }, {
            "name": "updateDate",
            "type": "TIMESTAMP",
            "mode": "NULLABLE"
          }, {
            "name": "creationDate",
            "type": "TIMESTAMP",
            "mode": "NULLABLE"
          }, {
            "name": "absencesTimes",
            "type": "RECORD",
            "mode": "REPEATED",
            "schema": {
              "fields": [{
                "name": "workUnitType",
                "type": "RECORD",
                "mode": "NULLABLE",
                "schema": {
                  "fields": [{
                    "name": "name",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "activityType",
                    "type": "STRING",
                    "mode": "NULLABLE"
                  }, {
                    "name": "reference",
                    "type": "INTEGER",
                    "mode": "NULLABLE"
                  }]
                }
              }, {
                "name": "duration",
                "type": "FLOAT",
                "mode": "NULLABLE"
              }, {
                "name": "startDate",
                "type": "DATE",
                "mode": "NULLABLE"
              }]
            }
          }, {
            "name": "term",
            "type": "STRING",
            "mode": "NULLABLE"
          }]
        }
      }, {
        "name": "type",
        "type": "STRING",
        "mode": "NULLABLE"
      }, {
        "name": "id",
        "type": "INTEGER",
        "mode": "NULLABLE"
      }]
    }
  }, {
    "name": "date",
    "type": "DATE",
    "mode": "NULLABLE"
  }]
}

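I know that BigQuery works better with newline-delimited JSON, but this table must contain the raw data, even if it ends up with only 1 row. The strangest part is that the transfer works for some files with a similar schema.

What should I do to make these transfers work?

Thanks for your help.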

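【Comments】:

  • So you are trying to use the Data Transfer Service to load JSON files from GCS into a BigQuery table that has the specific schema you shared, right?
  • Yes, as a daily routine.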

Tags: json google-bigquery google-cloud-storage transfer


【Answer 1】:

According to the GCP documentation, I would say that requiring newline-delimited JSON is a known limitation:

"JSON data must be newline delimited. Each JSON object must be on a separate line in the file."
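
To illustrate the requirement quoted above, here is a minimal sketch that re-serializes a pretty-printed JSON document so that each object sits on a single line. The helper name `to_ndjson` and the sample values are hypothetical and not taken from the question; only Python's standard `json` module is used.

```python
import json


def to_ndjson(text: str) -> str:
    """Re-serialize a JSON document as newline-delimited JSON.

    A top-level object becomes one line; a top-level array becomes
    one line per element.
    """
    parsed = json.loads(text)
    rows = parsed if isinstance(parsed, list) else [parsed]
    # json.dumps without `indent` never emits raw newlines, so each
    # row is guaranteed to occupy exactly one line.
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


# Hypothetical sample: a pretty-printed file holding a single object.
pretty = """{
  "date": "2021-07-21",
  "data": [{"id": 1, "type": "example"}]
}"""

ndjson = to_ndjson(pretty)
```

A file holding one top-level object stays a single row after conversion, which matches the asker's wish to keep the raw data in one row.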

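【Comments】:

  • Thanks for your reply, but I found the problem. One of the fields was auto-detected as "TIMESTAMP", and my JSON had a formatting issue that Google Transfers could not handle.
  • Cool! If the problem is solved, could you post the solution as an answer, so that it is easier for the community in case someone else faces the same situation in the future? :) It is important that we work together to keep this forum up to date and community-friendly. Thanks.
  • Done! (Sorry, I was on vacation.)
【Answer 2】: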

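Here is the solution:

When uploading manually, some fields were auto-detected as "TIMESTAMP". However, during the daily updates of these tables, Data Transfer did not recognize these fields as "TIMESTAMP". To fix this, I edited the schema and declared these fields as "STRING".

Thanks for your help ;)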
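
The schema edit described in this answer could be scripted roughly as follows. This is only a sketch: it assumes the schema is a dict laid out as in the question (child fields nested under a "schema" key), and the helper name `retype_fields` and the trimmed sample schema are hypothetical. Other schema layouts may nest their child fields differently.

```python
import copy


def retype_fields(schema: dict, old: str = "TIMESTAMP", new: str = "STRING") -> dict:
    """Return a copy of `schema` in which every field typed `old` is typed `new`."""
    result = copy.deepcopy(schema)  # leave the caller's schema untouched

    def walk(node: dict) -> None:
        for field in node.get("fields", []):
            if field.get("type") == old:
                field["type"] = new
            # In the layout used in the question, RECORD fields keep
            # their children under a nested "schema" key.
            if "schema" in field:
                walk(field["schema"])

    walk(result)
    return result


# Trimmed-down sample in the same layout as the schema in the question.
schema = {
    "fields": [
        {
            "name": "data",
            "type": "RECORD",
            "mode": "REPEATED",
            "schema": {
                "fields": [
                    {"name": "updateDate", "type": "TIMESTAMP", "mode": "NULLABLE"},
                    {"name": "id", "type": "INTEGER", "mode": "NULLABLE"},
                ]
            },
        },
        {"name": "date", "type": "DATE", "mode": "NULLABLE"},
    ]
}

patched = retype_fields(schema)
```

Once loaded as "STRING", such values can still be converted to timestamps later, for example in the scheduled queries that unnest the raw data.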

【Comments】:

  • Please don't add "thank you" as an answer. Instead, accept the answer that you found most helpful.