SQLAlchemy: secondary relationship update (answers)

【Question title】: Sqlalchemy: secondary relationship update
【Posted】: 2012-09-29 01:00:13
【Question description】:

I have two tables, say A and B. Both have a primary key id. They have a many-to-many relationship, SEC.

SEC = Table('sec', Base.metadata,
    Column('a_id', Integer, ForeignKey('A.id'), primary_key=True, nullable=False),
    Column('b_id', Integer, ForeignKey('B.id'), primary_key=True, nullable=False)
)

class A():
   ...
   id = Column(Integer, primary_key=True) 
   ...
   rels = relationship(B, secondary=SEC)

class B():
   ...
   id = Column(Integer, primary_key=True) 
   ...

Now consider this piece of code.

a = A()
b1 = B()
b2 = B()
a.rels = [b1, b2]
...
#some place later
b3 = B()
a.rels = [b1, b3]  # errors sometimes

Sometimes I get an error at the last line saying

duplicate key value violates unique constraint a_b_pkey

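As I understand it, it tries to add (a.id, b.id) to the 'sec' table again, which violates the unique constraint. Is that what is happening? If so, how can I avoid it? If not, why do I get this error?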

【Question comments】:

Tags: python postgresql orm sqlalchemy relationship

【Answer 1】:

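The error you mention does come from inserting a conflicting value into the sec table. To be sure it comes from the operation you think it does, and not from some earlier change, turn on SQL logging and check which values it tries to insert just before the error.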
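As a minimal sketch of turning on SQL logging: the "sqlalchemy.engine" logger at INFO level prints each statement with its bound parameters (passing echo=True to create_engine has a similar effect). The in-memory SQLite URL and the sample query are assumptions for illustration only; with the real engine, look for the INSERT INTO sec statements just before the failure.

```python
import logging

from sqlalchemy import create_engine, text

# Route SQLAlchemy's statement log through the standard logging module.
logging.basicConfig()
logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)

# Placeholder engine; substitute the application's own database URL.
engine = create_engine("sqlite:///:memory:")

with engine.connect() as conn:
    # The statement and its parameters show up in the log output.
    value = conn.execute(text("SELECT :x + 1"), {"x": 41}).scalar()
```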

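When you overwrite the value of a many-to-many collection, SQLAlchemy compares the new contents of the collection with the state in the database and issues the corresponding DELETE and INSERT statements. Unless you are poking around in SQLAlchemy internals, there should be two ways to run into this error.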
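The diffing behaviour can be observed with a small self-contained sketch. The table names "a" and "b", the in-memory SQLite database, and the SQLAlchemy 1.4+ imports are assumptions, since the question elides those details. Setting echo=True on the engine shows the individual statements.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

sec = Table(
    "sec", Base.metadata,
    Column("a_id", Integer, ForeignKey("a.id"), primary_key=True),
    Column("b_id", Integer, ForeignKey("b.id"), primary_key=True),
)


class B(Base):
    __tablename__ = "b"
    id = Column(Integer, primary_key=True)


class A(Base):
    __tablename__ = "a"
    id = Column(Integer, primary_key=True)
    rels = relationship(B, secondary=sec)


engine = create_engine("sqlite:///:memory:")  # echo=True to watch the SQL
Base.metadata.create_all(engine)

with Session(engine) as session:
    a, b1, b2, b3 = A(), B(), B(), B()
    a.rels = [b1, b2]
    session.add_all([a, b3])
    session.commit()      # inserts the (a, b1) and (a, b2) rows into sec

    a.rels = [b1, b3]     # the old collection is loaded and compared
    session.commit()      # removes (a, b2), adds (a, b3), leaves (a, b1) alone

    rows = sorted(tuple(r) for r in session.execute(select(sec.c.a_id, sec.c.b_id)))
    expected = sorted([(a.id, b1.id), (a.id, b3.id)])
```

Because the session knows the previous contents, the row for b1 is never inserted a second time, so no duplicate key error occurs in this single-process case.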

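The first is concurrent modification. Process 1 fetches a.rels and sees that it is empty. Meanwhile process 2 also fetches a.rels, sets it to [b1, b2] and commits, flushing the (a, b1) and (a, b2) tuples. Process 1 then sets a.rels to [b1, b3], still believing the previous contents were empty, and when it tries to flush the sec tuple (a, b1) it gets a duplicate key error. The correct action in such cases is usually to retry the transaction from the top. You can use serializable transaction isolation to get a serialization error in this case instead, which is distinct from a duplicate key error caused by a business logic error.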
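A rough sketch of "retry from the top" follows. The helper name run_with_retry, the attempt count, and the choice to retry on both IntegrityError and OperationalError are assumptions, not something the answer prescribes; with PostgreSQL and psycopg2 a serialization failure is expected to surface as OperationalError, but check what your driver raises. SQLite is used here only so the sketch runs without a server, and the first-attempt failure is simulated.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.exc import IntegrityError, OperationalError
from sqlalchemy.orm import sessionmaker

# isolation_level is a create_engine() parameter; point the URL at PostgreSQL in practice.
engine = create_engine("sqlite:///:memory:", isolation_level="SERIALIZABLE")
SessionFactory = sessionmaker(engine)


def run_with_retry(work, attempts=3):
    """Run work(session) in a fresh transaction, starting over on conflict."""
    for attempt in range(1, attempts + 1):
        session = SessionFactory()
        try:
            result = work(session)
            session.commit()
            return result
        except (IntegrityError, OperationalError):
            session.rollback()
            if attempt == attempts:
                raise  # give up after the last attempt
        finally:
            session.close()


calls = []


def flaky_work(session):
    # Simulates losing the race on the first attempt only.
    calls.append(1)
    if len(calls) == 1:
        raise IntegrityError("INSERT INTO sec", {}, Exception("duplicate key"))
    return session.execute(text("SELECT 1")).scalar()


result = run_with_retry(flaky_work)
```

The important part is that work re-reads everything it needs inside the new transaction, so the second attempt sees the rows the other process committed.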

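The second case happens when you have convinced SQLAlchemy that it does not need to know the database state, by setting the loading strategy of the rels attribute to noload. This can be done when defining the relationship, by adding the lazy='noload' parameter, or at query time, by calling .options(noload(A.rels)) on the query. SQLAlchemy will then assume that the sec table has no matching rows for objects loaded with this strategy in effect.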
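The noload case can be reproduced with a sketch like the one below. The table names, the SQLite database, and the SQLAlchemy 1.4+ imports are assumptions; noload is treated as a legacy option in recent releases and may emit a deprecation warning.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base, noload, relationship

Base = declarative_base()

sec = Table(
    "sec", Base.metadata,
    Column("a_id", Integer, ForeignKey("a.id"), primary_key=True),
    Column("b_id", Integer, ForeignKey("b.id"), primary_key=True),
)


class B(Base):
    __tablename__ = "b"
    id = Column(Integer, primary_key=True)


class A(Base):
    __tablename__ = "a"
    id = Column(Integer, primary_key=True)
    rels = relationship(B, secondary=sec)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

# Store (a, b1) and (a, b2) normally.
with Session(engine) as session:
    a, b1, b2, b3 = A(), B(), B(), B()
    a.rels = [b1, b2]
    session.add_all([a, b3])
    session.commit()
    a_id, b1_id, b3_id = a.id, b1.id, b3.id

# Reload a with noload: rels looks empty even though sec has rows for it.
with Session(engine) as session:
    a = session.query(A).options(noload(A.rels)).filter_by(id=a_id).one()
    b1, b3 = session.get(B, b1_id), session.get(B, b3_id)
    a.rels = [b1, b3]  # treated as two brand-new links
    try:
        session.commit()  # tries to insert (a, b1) a second time
        duplicate_key = False
    except IntegrityError:
        session.rollback()
        duplicate_key = True
```

If this is the cause, either drop the noload option for code paths that reassign the collection, or modify the collection only after it has been loaded normally.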

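【Comments】:

I am not really sure why. I need to test it properly and will then let you know. Thanks for your help.

【Answer 2】:

The problem is that you want to make sure the instances you create are unique. We can create an alternate constructor that checks a cache of existing uncommitted instances, or queries the database for an existing committed instance, before returning a new instance.

Here is a demonstration of such a method: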

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker  # SQLAlchemy 1.4+

engine = create_engine('sqlite:///:memory:', echo=True)
Session = sessionmaker(engine)
Base = declarative_base()

# a single module-level session, used by get_unique below
session = Session()


class Role(Base):
    __tablename__ = 'role'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)

    @classmethod
    def get_unique(cls, name):
        # get the session cache, creating it if necessary
        cache = session._unique_cache = getattr(session, '_unique_cache', {})
        # create a key for memoizing
        key = (cls, name)
        # check the cache first
        o = cache.get(key)
        if o is None:
            # check the database if it's not in the cache
            o = session.query(cls).filter_by(name=name).first()
            if o is None:
                # create a new one if it's not in the database
                o = cls(name=name)
                session.add(o)
            # update the cache
            cache[key] = o
        return o


# the metadata is no longer bound to an engine, so pass it explicitly
Base.metadata.create_all(engine)

# demonstrate cache check
r1 = Role.get_unique('admin')  # this is new
r2 = Role.get_unique('admin')  # from cache
session.commit()  # doesn't fail

# demonstrate database check
r1 = Role.get_unique('mod')  # this is new
session.commit()
session._unique_cache.clear()  # empty cache
r2 = Role.get_unique('mod')  # from database
session.commit()  # nop

# show final state
print(session.query(Role).all())  # two unique instances from four get_unique calls

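The get_unique method was inspired by the example from the SQLAlchemy wiki. This version is much less convoluted, favoring simplicity over flexibility. I have used it in production systems with no problems.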

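There are obviously improvements that can be added; this is just a simple example. The get_unique method could be inherited from a UniqueMixin, to be used for any number of models. More flexible memoizing of arguments could be implemented. This also puts aside the problem, mentioned by Ants Aasma, of multiple threads inserting conflicting data; handling that is more complex but should be an obvious extension. I leave that to you.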
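One possible shape for such a mixin is sketched below. The UniqueMixin body, the explicit session argument, the use of session.info for the cache, and the Tag model are all assumptions made for illustration; this is not the wiki recipe.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class UniqueMixin:
    """Hypothetical mixin: get-or-create by name, memoized per session."""

    @classmethod
    def get_unique(cls, session, name):
        # session.info is a plain dict that SQLAlchemy provides for user data
        cache = session.info.setdefault("unique_cache", {})
        key = (cls, name)
        obj = cache.get(key)
        if obj is None:
            # not cached: look in the database, then create if missing
            obj = session.query(cls).filter_by(name=name).first()
            if obj is None:
                obj = cls(name=name)
                session.add(obj)
            cache[key] = obj
        return obj


class Role(UniqueMixin, Base):
    __tablename__ = "role"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)


class Tag(UniqueMixin, Base):
    __tablename__ = "tag"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    r1 = Role.get_unique(session, "admin")  # new, pending
    r2 = Role.get_unique(session, "admin")  # same object, from the cache
    t1 = Tag.get_unique(session, "admin")   # different model, separate key
    session.commit()
    role_count = session.query(Role).count()
    tag_count = session.query(Tag).count()
```

Passing the session in, rather than relying on a module-level one, keeps each cache tied to one session. The cache is not cleared on rollback, so a real implementation would need to handle that, along with the concurrent-insert case mentioned above.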

【Comments】: