【Title】: Convert nested dict/json to a django ORM model, without hard coding the data structure
【Posted】: 2017-09-28 18:57:30
【Question】:

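I want to import data from a json file into my django database. The json contains nested objects.

The current steps are:

  1. Set up my django object models to match the json schema (done manually - see the models.py file below)
  2. Import the json file into a python dict using mydict = json.loads(file.read()) (done)
  3. Convert the dict to django models (done - but the solution is not pretty)

Is there a way to convert a nested dict into django models (i.e. step 3) without hard-coding the data structure into the logic?

Bonus points for automatically generating the django models (i.e. the models.py file) from a sample json file.

Thanks in advance!

How I'm currently doing it

Step 3 is easy if the dict does not contain any nested dicts - just construct a new object from the dict, i.e. MyModel.objects.create(**mydict), or use django fixtures.

However, because my json/dict contains nested objects, I'm currently doing step 3 like this: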

# read the json file into a python dict
d = json.loads(myfile.read())

# construct top-level object using the top-level dict
# (excluding nested lists of dicts called 'judges' and 'contestants')
c = Contest.objects.create(**{k:v for k,v in d.items() if k not in ('judges', 'contestants')})

# construct nested objects using the nested dicts
for judge in d['judges']:
    c.judge_set.create(**judge)
for contestant in d['contestants']:
    ct = c.contestant_set.create(**{k:v for k,v in contestant.items() if k not in ('singers', 'songs')})
    # all contestants sing songs
    for song in contestant['songs']:
        ct.song_set.create(**song)
    # not all contestants have a list of singers
    if 'singers' in contestant:
        for singer in contestant['singers']:
            ct.singer_set.create(**singer)

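This works, but requires the data structure to be hard-coded into the logic:

  • The names of the nested dicts to exclude have to be hard-coded in each call to create() (if you try to pass a nested dict to create(), it throws a TypeError). I thought about using **{k:v for k,v in contestant.items() if not hasattr(v, 'pop')} to exclude lists and dicts, but I suspect that won't work 100% of the time.
  • The logic that iteratively creates the nested objects has to be hard-coded
  • The logic that deals with nested objects that are not always present has to be hard-coded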
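
The exclusion in the first bullet can be written with an explicit isinstance check in place of the hasattr(v, 'pop') test, which would also match other types that happen to have a pop method, such as sets. A minimal sketch, where split_fields is a hypothetical helper name and not part of Django:

```python
def split_fields(record):
    """Separate plain column values from nested child collections."""
    scalars, nested = {}, {}
    for key, value in record.items():
        if isinstance(value, (list, dict)):
            nested[key] = value      # handled later, once the parent row exists
        else:
            scalars[key] = value     # safe to pass to create(**scalars)
    return scalars, nested


contestant = {
    "name": "Monkey Magic",
    "rank": "1",
    "songs": [{"title": "Undecided Medley", "m": "234"}],
    "singers": [{"part": "tenor", "name": "Alan"}],
}
scalars, nested = split_fields(contestant)
print(scalars)
print(sorted(nested))
```

Keys that are missing from a record, like 'singers' for some contestants, simply never show up in nested, so no special case is needed for optional children.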

Data structure

The sample json looks like this:

{
  "assoc": "THE BRITISH ASSOCIATION OF BARBERSHOP SINGERS",
  "contest": "QUARTET FINAL (NATIONAL STREAM)",
  "location": "CHELTENHAM",
  "year": "2007/08",
  "date": "25/05/2008",
  "type": "quartet final",
  "filename": "BABS/2008QF.pdf",
  "judges": [
    {"cat": "m", "name": "Rod"},
    {"cat": "m", "name": "Bob"},
    {"cat": "p", "name": "Pat"},
    {"cat": "p", "name": "Bob"},
    {"cat": "s", "name": "Mark"},
    {"cat": "s", "name": "Barry"},
    {"cat": "a", "name": "Phil"}
  ],
  "contestants": [
    {
      "prev_tot_score": "1393",
      "tot_score": "2774",
      "rank_m": "1",
      "rank_s": "1",
      "rank_p": "1",
      "rank": "1", "name": "Monkey Magic",
      "pc_score": "77.1",
      "songs": [
        {"title": "Undecided Medley","m": "234","s": "226","p": "241"},
        {"title": "What Kind Of Fool Am I","m": "232","s": "230","p": "230"},
        {"title": "Previous","m": "465","s": "462","p": "454"}
      ],
      "singers": [
        {"part": "tenor","name": "Alan"},
        {"part": "lead","name": "Zac"},
        {"part": "bari","name": "Joe"},
        {"part": "bass","name": "Duncan"}
      ]
    },
    {
      "prev_tot_score": "1342",
      "tot_score": "2690",
      "rank_m": "2",
      "rank_s": "2",
      "rank_p": "2",
      "rank": "2", "name": "Evolution",
      "pc_score": "74.7",
      "songs": [
        {"title": "It's Impossible","m": "224","s": "225","p": "218"},
        {"title": "Come Fly With Me","m": "225","s": "222","p": "228"},
        {"title": "Previous","m": "448","s": "453","p": "447"}
      ],
      "singers": [
        {"part": "tenor","name": "Tony"},
        {"part": "lead","name": "Michael"},
        {"part": "bari","name": "Geoff"},
        {"part": "bass","name": "Stuart"}
      ]
    }
  ]
}

My models.py file:

from django.db import models

# Create your models here.

class Contest(models.Model):
    assoc = models.CharField(max_length=100)
    contest = models.CharField(max_length=100)
    date = models.DateField()
    filename = models.CharField(max_length=100)
    location = models.CharField(max_length=100)
    type = models.CharField(max_length=20)
    year = models.CharField(max_length=20)


class Judge(models.Model):
    contest = models.ForeignKey(Contest, on_delete=models.CASCADE)
    name = models.CharField(max_length=60)
    cat = models.CharField('Category', max_length=2)


class Contestant(models.Model):
    contest = models.ForeignKey(Contest, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)
    tot_score = models.IntegerField('Total Score')
    rank_m = models.IntegerField()
    rank_s = models.IntegerField()
    rank_p = models.IntegerField()
    rank = models.IntegerField()
    pc_score = models.DecimalField(max_digits=4, decimal_places=1)
    # optional fields
    director = models.CharField(max_length=100, blank=True, null=True)
    size = models.IntegerField(blank=True, null=True)
    prev_tot_score = models.IntegerField(blank=True, null=True)


class Song(models.Model):
    contestant = models.ForeignKey(Contestant, on_delete=models.CASCADE)
    title = models.CharField(max_length=100)
    m = models.IntegerField('Music')
    s = models.IntegerField('Singing')
    p = models.IntegerField('Performance')

class Singer(models.Model):
    contestant = models.ForeignKey(Contestant, on_delete=models.CASCADE)
    name = models.CharField(max_length=100)
    part = models.CharField('Part', max_length=5)
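
One detail worth checking against these models: the sample json stores date as "25/05/2008", while DateField expects a datetime.date or an ISO-style YYYY-MM-DD string. Assuming every file uses day/month/year, a small conversion before calling create() might look like this (parse_contest_date is a hypothetical helper, not a Django function):

```python
from datetime import date, datetime


def parse_contest_date(raw):
    """Turn a 'DD/MM/YYYY' string into a datetime.date."""
    return datetime.strptime(raw, "%d/%m/%Y").date()


top_level = {"location": "CHELTENHAM", "date": "25/05/2008"}
top_level["date"] = parse_contest_date(top_level["date"])
print(top_level["date"])  # 2008-05-25
```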

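【Comments】:

  • An object model, by definition, contains a description of the data model. If you don't know the structure of your data, I very much doubt you can model it. Your best option is probably to extract the nested data into separate models with IDs and then import them by matching on that ID, but that would be a lot of work.

Tags: python json django dictionary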


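【Solution 1】:

You could walk the json object recursively and use a mapping from keys to classes to instantiate your models dynamically. Here is an idea (not a working solution!):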

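# map each nested json key to the model class it should create
key_model = {
    "judges": Judge,
    "contestants": Contestant,
    "songs": Song,
    "singers": Singer,
}

def make_sub_model(parent, model, vals):
    # assumption: the child's ForeignKey is named after the parent class in
    # lower case ('contest', 'contestant'), as in the models.py above
    fk_name = type(parent).__name__.lower()
    for v in vals:
        create_model(model, v, **{fk_name: parent})

def create_model(model, obj, **extra):
    # model should be the class and obj a dict;
    # extra carries the foreign key to the parent, if any

    # take care of the top-level object
    to_process = []  # nested lists, handled once the parent exists
    parent = {}  # plain attributes of the parent
    for k, v in obj.items():
        if isinstance(v, list):  # you probably want dict as well
            to_process.append((k, v))
        else:
            parent[k] = v

    parent_obj = model.objects.create(**parent, **extra)
    # now process the children
    for k, v in to_process:
        make_sub_model(parent_obj, key_model[k], v)

    return parent_obj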
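
The traversal part of this idea can be tried out without a database by flattening the nested dict into a parent-first list of rows. This is only a sketch under the assumption that every nested list contains dicts; plan_rows and its row layout are made up for illustration and are not part of Django:

```python
def plan_rows(key, record, parent=None, rows=None):
    """Flatten a nested record into rows listed parent-first.

    Each row is (key, fields, parent_index); parent_index is the position of
    the parent row in the returned list, or None for the top-level record.
    """
    if rows is None:
        rows = []
    fields = {k: v for k, v in record.items() if not isinstance(v, (list, dict))}
    rows.append((key, fields, parent))
    index = len(rows) - 1
    for k, v in record.items():
        if isinstance(v, dict):       # a single nested object
            plan_rows(k, v, index, rows)
        elif isinstance(v, list):     # a list of nested objects (assumed dicts)
            for child in v:
                plan_rows(k, child, index, rows)
    return rows


sample = {
    "contest": "QUARTET FINAL (NATIONAL STREAM)",
    "judges": [{"cat": "m", "name": "Rod"}],
    "contestants": [
        {"name": "Monkey Magic", "songs": [{"title": "Previous", "m": "465"}]},
    ],
}
rows = plan_rows("contest", sample)
for row in rows:
    print(row)
```

Because parents always come before their children, each row could then be handed to the class looked up in a key-to-model mapping such as key_model, with the object saved for the parent row passed along as the foreign key.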

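In the end, though, I would discourage doing this: you are using schema-based storage (SQL), so your code should enforce that the input matches your schema (you cannot handle anything different on the fly anyway). If you don't care about having a schema at all, go for a No-SQL solution and you won't have this problem. Or use a hybrid such as PostgreSQL.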
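
On the bonus part of the question (generating models.py from a sample file), a rough starting point could be a generator that emits source text for review rather than final models. The sketch below rests on several assumptions: every scalar becomes a CharField because all values in the sample json are strings, class names come from a naive singular form of the key, and only the first element of each list is inspected, so optional keys can be missed. singular and generate_models are hypothetical names:

```python
def singular(key):
    """Naive singular class name: 'contestants' -> 'Contestant'."""
    name = key[:-1] if key.endswith("s") else key
    return name.capitalize()


def generate_models(key, record, parent=None, out=None):
    """Return models.py source lines inferred from one sample record."""
    if out is None:
        out = ["from django.db import models", ""]
    cls = singular(key)
    body, children = [], []
    if parent:
        # the child points back at its parent, as in the hand-written models
        body.append(
            f"    {parent.lower()} = models.ForeignKey({parent}, on_delete=models.CASCADE)"
        )
    for k, v in record.items():
        if isinstance(v, list):
            if v and isinstance(v[0], dict):
                children.append((k, v[0]))   # first element serves as the sample
        elif isinstance(v, dict):
            children.append((k, v))
        else:
            body.append(f"    {k} = models.CharField(max_length=100)")
    out.append(f"class {cls}(models.Model):")
    out.extend(body or ["    pass"])
    out.append("")
    for k, v in children:
        generate_models(k, v, cls, out)
    return out


sample = {
    "contest": "QUARTET FINAL (NATIONAL STREAM)",
    "judges": [{"cat": "m", "name": "Rod"}],
    "contestants": [{"name": "Monkey Magic", "songs": [{"title": "Previous"}]}],
}
source = "\n".join(generate_models("contest", sample))
print(source)
```

The output still needs hand editing for field types (IntegerField, DateField, DecimalField), lengths, and optional fields, which is where the schema knowledge mentioned above comes back in.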
