【Question title】: How to import JSON to Cosmos DB
【Posted】: 2018-07-05 02:13:32
【Question】:

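I have an existing CouchDB instance with several databases. Using Postman, I exported 10 documents to a .json file. Using the DocumentDB Data Migration Tool, I was able to import the contents of this file into Cosmos DB.

However, it appears that I have effectively imported a single document. Azure Data Explorer indicates that my "collection" contains one document.

In CouchDB, the top-level identifier is doc._id. I have tried specifying "doc._id", "id", and "_id" as my ID field, all to no avail. What am I doing wrong? How do I import each JSON document individually?


Sample document: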

{
    "_id": "3147cb0e74449e1c28c6ded2b4a3fa45e0d65481-bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", 
    "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
    "delivery": {
        "application": "EnvTest", 
        "sender": {
            "id": "RXMARTINEZ@emailaddress.com", 
            "type": "user"
        }, 
        "recipients": [
            {
                "type": "email", 
                "recipient": "\"Artzer, Daniel J\" <DJArtzer@emailaddress.com>", 
                "sentTS": "2018-01-30T19:46:31.515Z", 
                "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
            }
        ]
    }, 
    "payload": {
        "startTS": "2017-11-30T07:38:33-06:00", 
        "equipmentOrLineType": "gathering system", 
        "scheduledIndicator": "upset", 
        "field": "Gulf Plains", 
        "eventDateTime": "2017-11-30T07:31:39-06:00", 
        "numCopies": "1", 
        "region": "South", 
        "facility": "", 
        "equipment": "", 
        "causeForRelease": "", 
        "correctiveAction": "", 
        "hgbUsed": "no-hgb-available", 
        "esdActivated": "no", 
        "submitTS": "2017-11-30T13:38:33.940Z", 
        "lineSegment": [
            {
                "lineName": "GPL-1", 
                "Diameter": "12.725", 
                "Length": "1000", 
                "duration": "no", 
                "volume": "3"
            }
        ], 
        "creator": "RXMARTINEZ@emailaddress.com", 
        "savedTS": "2017-11-30T13:38:33.955Z", 
        "trainingDoc": true, 
        "_id": "bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", 
        "syncTS": "2017-11-30T13:38:34.794Z"
    }
}

【Comments】:

    Tags: javascript azure azure-cosmosdb


    【Answer 1】:

    I imported a .json file containing 4 documents using the DocumentDB Data Migration Tool, and it worked fine for me.

    My .json file:

    [
        {
            "name":"jay1",
            "age":20
        },
        {
            "name":"jay2",
            "age":20
        },
        {
            "name":"jay3",
            "age":20
        },
        {
            "name":"jay4",
            "age":20
        }
    ]
    

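    The documents were imported into the collection as separate documents.

    The JSON file source importer option allows you to import one or more single-document JSON files, or JSON files that each contain an array of JSON documents. When adding folders that contain JSON files to import, you have the option of recursively searching for files in subfolders.

    You can check whether your .json file follows the rules above, as described in the official tutorial. (For example, are your documents inside an array?)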
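
    The array check above can be sketched in Python. This is a minimal sketch, not part of the migration tool: it assumes the Postman export is either already a JSON array, a single document, or a CouchDB `_all_docs?include_docs=true` style response, which wraps the documents as `{"rows": [{"doc": {...}}, ...]}`. The function names `to_document_array` and `convert_file` are hypothetical.

```python
import json


def to_document_array(exported):
    """Return a list of documents from a parsed JSON export.

    Handles three shapes (the second one is an assumption about the export):
    - a top-level list: returned unchanged
    - a CouchDB _all_docs-style wrapper {"rows": [{"doc": {...}}, ...]}
    - a single document object: wrapped in a one-element list
    """
    if isinstance(exported, list):
        return exported
    if isinstance(exported, dict) and isinstance(exported.get("rows"), list):
        # Keep only the rows that actually carry a document body.
        return [
            row["doc"]
            for row in exported["rows"]
            if isinstance(row, dict) and isinstance(row.get("doc"), dict)
        ]
    return [exported]


def convert_file(src_path, dst_path):
    """Rewrite src_path as a JSON array of documents in dst_path."""
    with open(src_path, encoding="utf-8") as src:
        docs = to_document_array(json.load(src))
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(docs, dst, indent=4, ensure_ascii=False)
    return len(docs)
```

    If the export turns out to be a wrapper object rather than an array, that would explain why the tool sees a single document. Note that a genuine single document that happens to have a top-level "rows" list would be misread by this sketch.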

    Hope it helps.


    Updated answer:

    I tried importing data into Cosmos DB that follows the format you mentioned in your reply.

    My updated .json file:

    [
        {
            "id":"001",
            "name":"jay1",
            "age":20,
            "array": {
                "a":1,
                "b":2,
                "c":{
                    "d":3,
                    "e": {
                        "f":4
                    }
                }
            }
        },
        {
            "id":"002",
            "name":"jay2",
            "age":20,
            "array": {
                "a":1,
                "b":2,
                "c":{
                    "d":3,
                    "e": {
                        "f":4
                    }
                }
            }
        },
        {
            "id":"003",
            "name":"jay3",
            "age":20,
            "array": {
                "a":1,
                "b":2,
                "c":{
                    "d":3,
                    "e": {
                        "f":4
                    }
                }
            }
        }
    ]
    

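    The documents were imported into the collection successfully.


    Updated answer 2:

    I created 3 sample documents in my .json file following the document format you provided. (I only changed the "_id" property to "testX".)

    My .json file: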

    [
        {
            "_id": "test1", 
            "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
            "delivery": {
                "application": "EnvTest", 
                "sender": {
                    "id": "RXMARTINEZ@emailaddress.com", 
                    "type": "user"
                }, 
                "recipients": [
                    {
                        "type": "email", 
                        "recipient": "\"Artzer, Daniel J\" <DJArtzer@emailaddress.com>", 
                        "sentTS": "2018-01-30T19:46:31.515Z", 
                        "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
                    }
                ]
            }, 
            "payload": {
                "startTS": "2017-11-30T07:38:33-06:00", 
                "equipmentOrLineType": "gathering system", 
                "scheduledIndicator": "upset", 
                "field": "Gulf Plains", 
                "eventDateTime": "2017-11-30T07:31:39-06:00", 
                "numCopies": "1", 
                "region": "South", 
                "facility": "", 
                "equipment": "", 
                "causeForRelease": "", 
                "correctiveAction": "", 
                "hgbUsed": "no-hgb-available", 
                "esdActivated": "no", 
                "submitTS": "2017-11-30T13:38:33.940Z", 
                "lineSegment": [
                    {
                        "lineName": "GPL-1", 
                        "Diameter": "12.725", 
                        "Length": "1000", 
                        "duration": "no", 
                        "volume": "3"
                    }
                ], 
                "creator": "RXMARTINEZ@emailaddress.com", 
                "savedTS": "2017-11-30T13:38:33.955Z", 
                "trainingDoc": true, 
                "_id": "bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", 
                "syncTS": "2017-11-30T13:38:34.794Z"
            }
        },
        {
            "_id": "test2", 
            "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
            "delivery": {
                "application": "EnvTest", 
                "sender": {
                    "id": "RXMARTINEZ@emailaddress.com", 
                    "type": "user"
                }, 
                "recipients": [
                    {
                        "type": "email", 
                        "recipient": "\"Artzer, Daniel J\" <DJArtzer@emailaddress.com>", 
                        "sentTS": "2018-01-30T19:46:31.515Z", 
                        "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
                    }
                ]
            }, 
            "payload": {
                "startTS": "2017-11-30T07:38:33-06:00", 
                "equipmentOrLineType": "gathering system", 
                "scheduledIndicator": "upset", 
                "field": "Gulf Plains", 
                "eventDateTime": "2017-11-30T07:31:39-06:00", 
                "numCopies": "1", 
                "region": "South", 
                "facility": "", 
                "equipment": "", 
                "causeForRelease": "", 
                "correctiveAction": "", 
                "hgbUsed": "no-hgb-available", 
                "esdActivated": "no", 
                "submitTS": "2017-11-30T13:38:33.940Z", 
                "lineSegment": [
                    {
                        "lineName": "GPL-1", 
                        "Diameter": "12.725", 
                        "Length": "1000", 
                        "duration": "no", 
                        "volume": "3"
                    }
                ], 
                "creator": "RXMARTINEZ@emailaddress.com", 
                "savedTS": "2017-11-30T13:38:33.955Z", 
                "trainingDoc": true, 
                "_id": "bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", 
                "syncTS": "2017-11-30T13:38:34.794Z"
            }
        },
        {
            "_id": "test3", 
            "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", 
            "delivery": {
                "application": "EnvTest", 
                "sender": {
                    "id": "RXMARTINEZ@emailaddress.com", 
                    "type": "user"
                }, 
                "recipients": [
                    {
                        "type": "email", 
                        "recipient": "\"Artzer, Daniel J\" <DJArtzer@emailaddress.com>", 
                        "sentTS": "2018-01-30T19:46:31.515Z", 
                        "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa"
                    }
                ]
            }, 
            "payload": {
                "startTS": "2017-11-30T07:38:33-06:00", 
                "equipmentOrLineType": "gathering system", 
                "scheduledIndicator": "upset", 
                "field": "Gulf Plains", 
                "eventDateTime": "2017-11-30T07:31:39-06:00", 
                "numCopies": "1", 
                "region": "South", 
                "facility": "", 
                "equipment": "", 
                "causeForRelease": "", 
                "correctiveAction": "", 
                "hgbUsed": "no-hgb-available", 
                "esdActivated": "no", 
                "submitTS": "2017-11-30T13:38:33.940Z", 
                "lineSegment": [
                    {
                        "lineName": "GPL-1", 
                        "Diameter": "12.725", 
                        "Length": "1000", 
                        "duration": "no", 
                        "volume": "3"
                    }
                ], 
                "creator": "RXMARTINEZ@emailaddress.com", 
                "savedTS": "2017-11-30T13:38:33.955Z", 
                "trainingDoc": true, 
                "_id": "bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", 
                "syncTS": "2017-11-30T13:38:34.794Z"
            }
        }
    ]
    

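    The documents were imported into the database without any errors.

    If anything differs between your setup and mine, please let me know.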
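
    Cosmos DB identifies each document by a top-level "id" property, while CouchDB uses "_id". If you would rather carry the CouchDB identifier over in the file itself than rely on the tool's ID field setting, a minimal Python sketch could look like the following. The helper name `with_cosmos_id` is hypothetical, and dropping "_rev" is an assumption (it is CouchDB's revision marker); pass `drop_fields=()` to keep it.

```python
def with_cosmos_id(doc, source_field="_id", drop_fields=("_rev",)):
    """Return a copy of doc with a top-level string "id" taken from source_field.

    Only top-level keys are touched, so a nested "_id" (for example inside
    "payload") is left as it is. An existing "id" is never overwritten.
    """
    result = {key: value for key, value in doc.items() if key not in drop_fields}
    if "id" not in result and source_field in result:
        result["id"] = str(result[source_field])
    return result
```

    Applied to each element of the array before saving the file, this leaves the original "_id" in place and adds a matching "id" next to it.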

    【Comments】:

    • Thanks. Perhaps my problem is that my JSON documents are more complex than allowed. I saved a file containing multiple JSON documents, each of which contains 3 nested objects. Perhaps this format is incompatible in some way? I can provide a sample if that would help.
    • @CarCrazyBen It would be helpful if you provided the sample document you want to import. I will test it with the tool.
    • Here is a sample document. It may be a mess, because I have to send it in 2 parts: { "_id": "3147cb0e74449e1c28c6ded2b4a3fa45e0d65481-bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", "_rev": "3-99aef1458fe1a8f310c83156b9d06a69", "delivery": { "application": "EnvTest", "sender": { "id": "RXMARTINEZ@emailaddress.com", "type": "user" }, "recipients": [ { "type": "email", "recipient": "\"Artzer, Daniel J\" <DJArtzer@emailaddress.com>", "sentTS": "2018-01-30T19:46:31.515Z", "id": "45281ab0-05f6-11e8-a86a-61a54dcb42aa" }, ] },
    • "payload": { "startTS": "2017-11-30T07:38:33-06:00", "equipmentOrLineType": "gathering system", "scheduledIndicator": "upset", "field": "Gulf Plains", "eventDateTime": "2017-11-30T07:31:39-06:00", "numCopies": "1", "region": "South", "facility": "", "equipment": "", "causeForRelease": "", "correctiveAction": "", "hgbUsed": "no-hgb-available", "esdActivated": "no", "submitTS": "2017-11-30T13:38:33.940Z",
    • "lineSegment": [ { "lineName": "GPL-1", "Diameter": "12.725", "Length": "1000", "duration": "no", "volume": "3" } ], "creator": "RXMARTINEZ@emailaddress.com", "savedTS": "2017-11-30T13:38:33.955Z", "trainingDoc": true, "_id": "bd_RXMARTINEZ@emailaddress.com_2017-11-30T13:38:33.955Z", "syncTS": "2017-11-30T13:38:34.794Z" } } Correction, 3 parts.
    【Answer 2】:

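    The other answer is from about a year ago, and it uses dtui.exe.

    dtui.exe is a GUI tool
    dt.exe is a command-line tool

    As of today, both can be downloaded from https://www.microsoft.com/en-us/download/details.aspx?id=46436

    The solution below uses dt.exe (automate your process so that it can be repeated in the future without problems):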

    `"C:\path-to-your-dt-exe\dt.exe"  /s:JsonFile   /s.Files:"C:\path-to-your-json-files-of-one-collection-only\*.json"   /t:DocumentDB    /t.ConnectionString:"AccountEndpoint=https://localhost:8081/;AccountKey=<big account key here>;Database=<your db name here>"   /t.Collection:<your collection name here>   /t.PartitionKey:<your partition key here>   /t.CollectionThroughput:2500`
    

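    Notes
    - The database (<your db name here>) and collection (<your collection name here>) above must already exist.
    - Replace each <...> above with your own values, then run the command in cmd.
    - /s is the source and /t is the target; here the source is JSON files and the target is a local DocumentDB.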
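
    To make the import repeatable from a script, the same command line can be assembled in Python. This is a minimal sketch that only reuses the flags shown in the command above; the function names `build_dt_args` and `run_import` and all example values are hypothetical, and the path to dt.exe is whatever applies on your machine.

```python
import subprocess


def build_dt_args(dt_exe, files_glob, connection_string, collection,
                  partition_key, throughput):
    """Build the dt.exe argument list, mirroring the command shown above."""
    return [
        dt_exe,
        "/s:JsonFile",                                # source: JSON files
        f"/s.Files:{files_glob}",
        "/t:DocumentDB",                              # target: DocumentDB
        f"/t.ConnectionString:{connection_string}",
        f"/t.Collection:{collection}",
        f"/t.PartitionKey:{partition_key}",
        f"/t.CollectionThroughput:{throughput}",
    ]


def run_import(args):
    """Run dt.exe; check=True raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(args, check=True)
```

    Passing the arguments as a list lets `subprocess` handle the quoting of paths that contain spaces, so the values do not need to be wrapped in quotes by hand.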

    Hope this helps.

    【Comments】:
