【Posted】: 2016-03-23 18:47:07
【Question】:
Firebase offers private backups on Google Cloud Storage. One of the featured use cases is "Ingestion into Analytics Products":
Private Backups provides a perfect pipeline into cloud analytics products such as Google’s BigQuery. Cloud Analytics products often prefer to ingest data through Cloud Storage buckets rather than directly from the application.
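I have a lot of data in Firebase (more than 1 GB when exported to a Cloud Storage bucket) and, as described in the Firebase offering, I would like to load that data into BigQuery.
But is it really possible to write a table schema that fits the raw Firebase data? Let's take the dinosaur-facts database from the Firebase documentation as an example. The JSON looks like this: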
{
"dinosaurs" : {
"bruhathkayosaurus" : {
"appeared" : -70000000,
"height" : 25
},
"lambeosaurus" : {
"appeared" : -76000000,
"height" : 2.1
}
},
"scores" : {
"bruhathkayosaurus" : 55,
"lambeosaurus" : 21
}
}
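To list all dinosaurs, I suppose the only way is to use a RECORD field in the BigQuery schema. But a repeated RECORD in BigQuery normally corresponds to an array in the imported JSON, and there is no array here in Firebase, just an object with the dinosaur names as its keys.
So a BigQuery table schema like this does not work: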
[
{
"name": "dinosaurs",
"type": "RECORD",
"mode": "REQUIRED",
"fields": [
{
"name": "dinosaur",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "appeared",
"type": "INTEGER"
},
{
"name": "height",
"type": "INTEGER"
},
{
"name": "length",
"type": "INTEGER"
},
{
"name": "order",
"type": "STRING"
},
{
"name": "vanished",
"type": "INTEGER"
},
{
"name": "weight",
"type": "INTEGER"
}
]
},
{
"name": "scores",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "dinosaur",
"type": "INTEGER"
}
]
}
]
}
]
Is it possible to write a table schema that fits the raw Firebase data? Or should we prepare the data first to make it compatible with BigQuery?
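To make the second option concrete, here is a minimal sketch of what "preparing the data first" might look like, assuming the export has the shape of the dinosaur-facts JSON above. It turns each name-keyed object into a flat row and writes newline-delimited JSON, which is the JSON layout BigQuery load jobs accept. The column names `name` and `score` and the helper names `flatten_dinosaurs` and `to_ndjson` are hypothetical, chosen only for illustration.

```python
import json


def flatten_dinosaurs(export):
    """Turn the name-keyed Firebase objects into one flat row per dinosaur."""
    scores = export.get("scores", {})
    rows = []
    for name, facts in export.get("dinosaurs", {}).items():
        row = {"name": name}              # the object key becomes a normal column
        row.update(facts)                 # appeared, height, ...
        row["score"] = scores.get(name)   # None if this dinosaur has no score
        rows.append(row)
    return rows


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


# Sample data copied from the question.
export = {
    "dinosaurs": {
        "bruhathkayosaurus": {"appeared": -70000000, "height": 25},
        "lambeosaurus": {"appeared": -76000000, "height": 2.1},
    },
    "scores": {"bruhathkayosaurus": 55, "lambeosaurus": 21},
}

rows = flatten_dinosaurs(export)
ndjson = to_ndjson(rows)
print(ndjson)
```

With rows shaped like this, a flat schema with one column per field would be enough and no RECORD is needed. Note that `height` would have to be FLOAT rather than INTEGER, since the sample data contains 2.1.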
【Discussion】:
Tags: json firebase google-bigquery