【Posted】: 2018-07-22 23:47:39
【Question】:
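I need to load several Excel files into a PostgreSQL table, but many of their records may overlap with each other, so I have to deal with IntegrityErrors. I have tried two approaches:
cursor.copy_from: the fastest approach, but because of the duplicate records I don't know how to catch and handle all the IntegrityErrors.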
streamCSV = StringIO()
streamCSV.write(invoicing_info.to_csv(index=None, header=None, sep=';'))
streamCSV.seek(0)
with conn.cursor() as c:
    c.copy_from(streamCSV, "staging.table_name", columns=dataframe.columns, sep=';')
    conn.commit()
cursor.execute: I can count and handle every exception, but it is very slow.
data = invoicing_info.to_dict(orient='records')
with cursor as c:
    for entry in data:
        try:
            c.execute(DLL_INSERT, entry)
            successful_inserts += 1
            connection.commit()
            print('Successful insert. Operation number {}'.format(successful_inserts))
        except psycopg2.IntegrityError as duplicate:
            duplicate_registers += 1
            connection.rollback()
            print('Duplicate entry. Operation number {}'.format(duplicate_registers))
At the end of the routine, I need to report the following information:
print("Initial shape: {}".format(invoicing_info.shape))
print("Successful inserts: {}".format(successful_inserts))
print("Duplicate entries: {}".format(duplicate_registers))
How can I modify the first approach so that it handles all the exceptions? How can I optimize the second approach?
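One possible direction for the first approach, sketched under assumptions rather than offered as a definitive answer: copy the file into a temporary table that carries no unique constraints, then move the rows into the real table with INSERT ... ON CONFLICT DO NOTHING (PostgreSQL 9.5 or later) and read the counts back instead of catching exceptions. The function name load_skipping_duplicates and the temp table name tmp_load are hypothetical. The sketch also assumes the connection is in psycopg2's default non-autocommit mode and that the data contains no NULLs or embedded separators.

```python
def load_skipping_duplicates(conn, csv_stream, target, columns, sep=";"):
    """Stage rows with COPY, then insert only those that do not conflict.

    Returns a tuple (staged_rows, inserted_rows, duplicate_rows).
    `conn` is assumed to be a psycopg2 connection with autocommit off.
    """
    columns = list(columns)
    column_list = ", ".join(columns)
    with conn.cursor() as cur:
        # LIKE copies column names and types but not unique or primary key
        # constraints, so duplicates inside the file do not abort the COPY.
        cur.execute(
            "CREATE TEMP TABLE tmp_load (LIKE {}) ON COMMIT DROP".format(target)
        )
        cur.copy_from(csv_stream, "tmp_load", sep=sep, columns=columns)

        cur.execute("SELECT count(*) FROM tmp_load")
        staged = cur.fetchone()[0]

        # Rows that would violate a unique constraint are skipped silently.
        cur.execute(
            "INSERT INTO {t} ({c}) SELECT {c} FROM tmp_load "
            "ON CONFLICT DO NOTHING".format(t=target, c=column_list)
        )
        inserted = cur.rowcount  # rows actually written to the target table
    conn.commit()
    return staged, inserted, staged - inserted
```

With the names from the question, a call might look like load_skipping_duplicates(conn, streamCSV, "staging.table_name", invoicing_info.columns), and the three returned numbers correspond to the initial row count, the successful inserts, and the duplicate entries. The whole load runs in a single transaction, so there is one commit instead of one per row.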
【Discussion】:
Tags: postgresql python-3.6 psycopg2