【Question title】: Is there a faster way to get a big dataframe from Python to a SQL Server table?
【Posted】: 2021-02-02 10:01:05
【Question】:

I am trying to write nearly 13,000,000 rows to SQL Server, but it fails with a resource pool error. The code is below:

def execute_sql_file(pconnection, pfilepath, **psqlparams):
    cursor = pconnection.cursor()
    # Read the whole SQL script; the with-block closes the file even on error
    with open(pfilepath) as f:
        full_sql = f.read()
    # Substitute #key# placeholders with the supplied values
    for key, value in psqlparams.items():
        full_sql = full_sql.replace('#' + str(key) + '#', str(value))
    return cursor.execute(full_sql)


import math
import urllib.parse

import sqlalchemy


def write_result_to_db(dfsave, pServer='oltp', pDatabase='warehouse', pSchema='dbo',
                       pTable='tb_table'):
    # URL-encode the ODBC connection string so it can be embedded in the engine URL
    params = urllib.parse.quote_plus(
        "DRIVER={SQL Server Native Client 11.0};SERVER=" + pServer +
        ";DATABASE=" + pDatabase +
        ";Trusted_Connection=yes;MARS_Connection=Yes")
    engine = sqlalchemy.create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
    to_db_connection = engine.connect()

    # Turn on pyodbc's fast_executemany for executemany() calls
    @sqlalchemy.event.listens_for(engine, "before_cursor_execute")
    def receive_before_cursor_execute(conn, cursor, statement, params, context, executemany):
        if executemany:
            cursor.fast_executemany = True

    # Rows per chunk, kept under SQL Server's 2100-parameter limit per statement
    chunknum = math.floor(2100 / dfsave.shape[1]) - 1
    if dfsave.shape[0] > 0:
        dfsave.to_sql(pTable, con=engine, schema=pSchema, if_exists='append', index=False,
                      method='multi', chunksize=chunknum)

    return to_db_connection
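
The `chunknum` line above follows from SQL Server's limit of 2100 parameters per statement: with `method='multi'`, every row in a chunk contributes one parameter per column. A minimal sketch of that arithmetic, using the same formula as the function (the helper name `max_rows_per_insert` is hypothetical and not part of pandas or SQLAlchemy):

```python
import math

SQL_SERVER_MAX_PARAMS = 2100  # SQL Server's per-statement parameter limit


def max_rows_per_insert(n_columns, max_params=SQL_SERVER_MAX_PARAMS):
    """Rows per multi-row INSERT, using the same formula as chunknum above."""
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    return math.floor(max_params / n_columns) - 1


rows = max_rows_per_insert(10)
print(rows, rows * 10)  # 209 rows, 2090 parameters per statement
```

For a hypothetical 10-column frame, that is 209 rows per statement, so 13,000,000 rows would take on the order of 62,000 separate INSERT statements.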

Those are the functions. After some calculations I end up with df. When I try to send it to SQL Server, I use the following code:

write_result_to_db(df, pTable='tb_table', pServer='testoltp', pSchema='dbo',
                   pDatabase='warehouse')

But it gives an error. Then I tried splitting the df like this and writing it to the SQL table in pieces:

bol = int(tb_result.shape[0] / 50)
for start in range(0, tb_result.shape[0], bol):
    write_result_to_db(tb_result.iloc[start:start + bol], pTable='tb_table',
                       pServer='testoltp', pSchema='dbo', pDatabase='testwarehouse')
    time.sleep(15)

After running this loop, it gives the resource pool error. I guess I can't do it this way. So how can I get a dataframe from Python into a SQL Server table?
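
A side note on the slicing loop: its step is `int(n / 50)`, which rounds down, so it can produce more than 50 pieces, and for fewer than 50 rows the step is 0 and `range()` raises `ValueError`. A minimal sketch of one way to compute the slice bounds with ceiling division instead (the helper name `iter_slices` is hypothetical):

```python
def iter_slices(n_rows, n_parts):
    """Yield (start, stop) pairs that cover range(n_rows) in at most n_parts pieces."""
    if n_rows <= 0:
        return
    step = -(-n_rows // n_parts)  # ceiling division, so step is never 0
    for start in range(0, n_rows, step):
        yield start, min(start + step, n_rows)


slices = list(iter_slices(105, 50))
print(len(slices), slices[0], slices[-1])  # 35 (0, 3) (102, 105)
```

Each pair could then be used as `tb_result.iloc[start:stop]`. This only tidies the slicing; it does not by itself address the resource pool error.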

【Comments】:

Tags: python dataframe sqlalchemy


【Solution 1】:

Try the modin library. It may help you solve this: https://github.com/modin-project/modin
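
Separately from the modin suggestion, and not part of the original answer: a commonly used pattern with pyodbc is to pass `fast_executemany=True` to `create_engine` and leave `to_sql` on its default method. As far as I know, `fast_executemany` only affects `executemany` calls, whereas `method='multi'` builds one multi-row INSERT per chunk, so the two do not combine. The sketch below is a rough illustration under those assumptions: the helper names `build_odbc_url` and `write_frame` and the `chunksize` of 100,000 are hypothetical, and it is not guaranteed to resolve the resource pool error.

```python
import urllib.parse


def build_odbc_url(server, database, driver="SQL Server Native Client 11.0"):
    """Build a SQLAlchemy URL from an ODBC connection string (trusted connection)."""
    odbc = f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};Trusted_Connection=yes"
    return "mssql+pyodbc:///?odbc_connect=" + urllib.parse.quote_plus(odbc)


def write_frame(df, server, database, table, schema="dbo", chunksize=100_000):
    """Append df to schema.table through one engine, then release its connections."""
    import sqlalchemy  # imported lazily; requires sqlalchemy and pyodbc to be installed

    # fast_executemany is handed to pyodbc by the mssql+pyodbc dialect
    engine = sqlalchemy.create_engine(build_odbc_url(server, database),
                                      fast_executemany=True)
    try:
        # method is left at its default (not 'multi'), so rows go through executemany;
        # chunksize here is an assumed value, tune it for the available memory
        df.to_sql(table, con=engine, schema=schema, if_exists="append",
                  index=False, chunksize=chunksize)
    finally:
        engine.dispose()  # close the pooled connections


print(build_odbc_url("testoltp", "warehouse"))
```

A call would look like `write_frame(df, "testoltp", "warehouse", "tb_table")`. Unlike the question's function, this creates one engine per call and disposes of it afterwards, rather than returning an open connection.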

【Comments】:
