SAWarning when querying with SQLAlchemy into a pandas DataFrame

【Question title】: SAWarning when querying with SQLAlchemy into pandas df
【Posted】: 2015-07-06 04:41:03
【Question】:

I am querying my SQLAlchemy-mapped star schema directly into a pandas DataFrame and am getting an annoying SAWarning from pandas that I would like to address. Here is a simplified version.

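from sqlalchemy import Column, ForeignKey, Integer, String
# SQLAlchemy 1.4+; older releases import declarative_base from sqlalchemy.ext.declarative
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class School(Base):
    __tablename__ = 'DimSchool'

    id = Column('SchoolKey', Integer, primary_key=True)
    name = Column('SchoolName', String)
    district = Column('SchoolDistrict', String)


class StudentScore(Base):
    __tablename__ = 'FactStudentScore'

    # Composite primary key made of the two foreign keys
    StudentKey = Column('StudentKey', Integer, ForeignKey('DimStudent.StudentKey'), primary_key=True)
    SchoolKey = Column('SchoolKey', Integer, ForeignKey('DimSchool.SchoolKey'), primary_key=True)
    PointsPossible = Column('PointsPossible', Integer)
    PointsReceived = Column('PointsReceived', Integer)

    # The Student class (mapped to DimStudent) is omitted from this simplified version
    student = relationship("Student", backref='studentscore')
    school = relationship("School", backref='studentscore')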

I query the data with a statement like this:

import pandas as pd

# `session` is an existing SQLAlchemy Session
sch = session.query(StudentScore, School).\
    join(School).filter(School.name.like('%Dever%'))

testdf = pd.read_sql(sch.statement, sch.session.bind)

and then get this warning:

SAWarning: Column 'SchoolKey' on table <sqlalchemy.sql.selectable.Select at 0x1ab7abe0; Select object> being replaced by Column('SchoolKey', Integer(), table=<Select object>, primary_key=True, nullable=False), which has the same key.  Consider use_labels for select() statements.

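I get this warning for every additional table (class) included in my join. The message always refers to the foreign key column.

Has anyone else run into this warning and determined the root cause? Or have you all just been ignoring it as well?

EDIT/UPDATE: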

Handling Duplicate Columns in Pandas DataFrame constructor from SQLAlchemy Join

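These folks seem to be discussing a related problem, but they use a different pandas method to bring in the DataFrame and want to keep the duplicate columns rather than drop them. Does anyone have thoughts on how to implement a similarly styled function that instead removes the duplicates as the query comes back?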
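One way to approach this, as a minimal sketch: if the joined result reaches pandas with a repeated column name, the repeats can be dropped after the fact with `Index.duplicated()`. The DataFrame below is made up to stand in for the query result; its values and column order are assumptions, not output from the real schema.

```python
import pandas as pd

# Hypothetical frame standing in for the result of the joined query:
# 'SchoolKey' appears twice, once from each table.
df = pd.DataFrame(
    [[1, 10, 100, 90, 10, "Dever High"]],
    columns=["StudentKey", "SchoolKey", "PointsPossible",
             "PointsReceived", "SchoolKey", "SchoolName"],
)

# columns.duplicated() flags every repeat of a name after its first occurrence,
# so negating the mask keeps only the first column for each name.
deduped = df.loc[:, ~df.columns.duplicated()]

print(list(deduped.columns))
```

This keeps the first occurrence of each name, which is reasonable when the repeated columns are join keys holding the same values. It only tidies the DataFrame; it does not change the statement that SQLAlchemy builds.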

【Comments】:

Tags: python pandas sqlalchemy

【Answer 1】:

For what it's worth, here is my limited answer.

For the following SAWarning:

SAWarning: Column 'SchoolKey' on table <sqlalchemy.sql.selectable.Select at 0x1ab7abe0; Select object> being replaced by Column('SchoolKey', Integer(), table=<Select object>, primary_key=True, nullable=False), which has the same key.  Consider use_labels for select() statements.

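It is telling you that there are columns with duplicate names, even though those columns are in different tables. In most cases this is harmless, since the columns are simply join keys. However, I have run into cases where the tables contain identically named columns holding different data (e.g. a teacher table with a name column and a student table with a name column). In these cases, rename the pandas DataFrame columns using a method like this, or rename the columns in the underlying database table.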
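Another option is to avoid the name clash in the query itself by selecting individual columns and giving the colliding ones explicit labels. The following is only a sketch under stated assumptions: it uses an in-memory SQLite database, a trimmed-down copy of the models from the question (the foreign key to DimStudent is left out), one made-up row per table, and label names (`score_school_key`, `school_key`, `school_name`) invented for illustration. It assumes SQLAlchemy 1.4 or later and a pandas release that supports that SQLAlchemy version.

```python
import pandas as pd
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class School(Base):
    __tablename__ = "DimSchool"
    id = Column("SchoolKey", Integer, primary_key=True)
    name = Column("SchoolName", String)


class StudentScore(Base):
    __tablename__ = "FactStudentScore"
    # Simplified: the foreign key to DimStudent from the question is left out
    StudentKey = Column("StudentKey", Integer, primary_key=True)
    SchoolKey = Column("SchoolKey", Integer, ForeignKey("DimSchool.SchoolKey"),
                       primary_key=True)
    PointsReceived = Column("PointsReceived", Integer)


# In-memory SQLite database holding one made-up row per table
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(School(id=10, name="Dever High"))
    session.add(StudentScore(StudentKey=1, SchoolKey=10, PointsReceived=90))
    session.commit()

    # Select individual columns and label the ones whose names would collide
    query = (
        session.query(
            StudentScore.StudentKey,
            StudentScore.SchoolKey.label("score_school_key"),
            StudentScore.PointsReceived,
            School.id.label("school_key"),
            School.name.label("school_name"),
        )
        .select_from(StudentScore)
        .join(School, StudentScore.SchoolKey == School.id)
        .filter(School.name.like("%Dever%"))
    )
    testdf = pd.read_sql(query.statement, engine)

print(list(testdf.columns))
```

Because every selected column has a unique name, the resulting DataFrame needs no renaming afterwards. The trade-off is that the columns must be listed one by one instead of querying whole mapped classes.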

I'll keep an eye on this question, and if anyone has a better answer I will gladly award it.

【Comments】: