【Question title】: How to improve performance of a function with cursors in PostgreSQL?
【Posted】: 2014-03-20 12:30:21
【Question】:

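I have a function with two nested cursors. The outer cursor fetches a customer's payment details from the source and inserts them into the target based on some business logic. The inner cursor fetches the payment details for each payment, one payment after another.

The payments table has about 125,000 rows, and the payment details table has about 335,000 rows. All of these rows are to be migrated to the target tables. Executing the function takes more than two hours, and database CPU usage goes up to 99%.
I am using PostgreSQL 9.2.

How can I improve the performance of the function?

The code I am using: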

CREATE OR REPLACE FUNCTION ccdb_dummy.o_payments1(a integer)
  RETURNS void AS
$BODY$
DECLARE
cursor_1  refcursor;
cursor_2  refcursor;
rowcount integer;
rowcount1 integer;
payment_no bigint;
query_1 character varying(250);
lc_cin_num bigint;
test character varying(50);
t_payments ccdb_stg.o_payments_stg%ROWTYPE;
t_payments_details ccdb_stg.o_payment_head_dtls_stg%ROWTYPE;

BEGIN
rowcount := 0;

open cursor_1 for select * from ccdb_stg.o_payments_stg WHERE section_code = a;

select count(1) into rowcount from ccdb_stg.o_payments_stg WHERE section_code = a;

for i IN 1..rowcount loop

fetch cursor_1 into t_payments;

payment_no= nextval('ccdb_stg.payments_seq');

select cin into lc_cin_num from ccdb_dummy.consumers a where a.consumer_num =           t_payments.consumer_num;

insert into  ccdb_dummy.payments(payment_id,receipt_id,source_system_flag,cin,consumer_nbr,cust_connection_id,cust_type_flg,receipt_type_id,mop_code,mop_details,coll_effect_date,coll_entry_date,receipt_num,receipt_amt,receipt_loc_flg,receipt_date,cancel_flag,acc_type_id,cust_section_code,coll_section_code,remarks,pm_paydate,pm_amount,ref_transaction_id,creation_dt,created_by)    values(payment_no,t_payments.receipt_id,t_payments.origin_flag,lc_cin_num,t_payments.consumer_num,t_payments.cust_connection_id,t_payments.cust_type_flag,t_payments.receipt_type_id,t_payments.mop_id,t_payments.mop_details,t_payments.coll_effect_date,t_payments.coll_entry_date,t_payments.receipt_num,t_payments.receipt_amt,t_payments.receipt_flag,t_payments.receipt_date,t_payments.cancel_flag,t_payments.acc_type_flag,t_payments.cust_section_code,t_payments.coll_section_code,t_payments.remarks,t_payments.pm_paydate,t_payments.pm_amount,null,now(),'system');

select count(1) into rowcount1 from ccdb_stg.o_payment_head_dtls_stg a where a.mbc_receipt_id = t_payments.receipt_id;

open cursor_2 for select * from ccdb_stg.o_payment_head_dtls_stg a where a.mbc_receipt_id = t_payments.receipt_id;

for i IN 1..rowcount1 loop

fetch cursor_2 into t_payments_details;

insert into ccdb_dummy.payment_head_dtls(payment_id,mbc_receipt_id,charge_head_code,amount,tariff_id,creation_dt,created_by)
values (payment_no,t_payments_details.mbc_receipt_id,t_payments_details.charge_head_code,t_payments_details.amount,t_payments_details.tariff_id,now(),'system');

end loop;

close cursor_2;

end loop;

close cursor_1;

END;
$BODY$
  LANGUAGE plpgsql;

【Comments】:

    Tags: function postgresql cursor plpgsql database-performance


    【Answer 1】:

    Your function, radically simplified:

    CREATE OR REPLACE FUNCTION ccdb_dummy.o_payments1(a integer)
      RETURNS void
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       t   record;
       t1  record;
    BEGIN
       FOR t IN
          SELECT *
               , nextval('ccdb_stg.payments_seq') AS payment_no
               , c.cin
          FROM   ccdb_stg.o_payments_stg   p
          LEFT   JOIN ccdb_dummy.consumers c USING (consumer_num)
          WHERE  p.section_code = $1
       LOOP
          INSERT INTO ccdb_dummy.payments(payment_id,receipt_id,source_system_flag,cin, ... ,pm_amount,ref_transaction_id,creation_dt,created_by)
          VALUES(t.payment_no,t.receipt_id,t.origin_flag,t.cin, ... ,t.pm_amount,null,now(),'system');
    
          FOR t1 IN
             SELECT *
             FROM   ccdb_stg.o_payment_head_dtls_stg h
             WHERE  h.mbc_receipt_id = t.receipt_id
          LOOP
             INSERT INTO ccdb_dummy.payment_head_dtls(payment_id,mbc_receipt_id,charge_head_code,amount,tariff_id,creation_dt,created_by)
             VALUES (t.payment_no,t1.mbc_receipt_id,t1.charge_head_code,t1.amount,t1.tariff_id,now(),'system');
          END LOOP;
       END LOOP;
    END
    $func$;
    
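    • Use the implicit cursor of a FOR loop instead of unwieldy explicit cursors with redundant counts and counting loops. Much simpler and faster. Read the chapter "Looping Through Query Results" in the manual.

    • LEFT JOIN to ccdb_dummy.consumers in the first SELECT instead of running a separate SELECT for every row.

    • Also include nextval('ccdb_stg.payments_seq') AS payment_no in the first SELECT. Cheaper than many separate queries.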
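
    As a minimal sketch of the first point, the DO block below loops over a query result with an implicit cursor. The row source (generate_series) and the variable names r and n are placeholders, not part of the original function. The loop ends on its own when the result is exhausted, so no separate count(*) query and no OPEN / FETCH / CLOSE are needed.

```sql
DO
$do$
DECLARE
   r record;          -- receives one row per iteration
   n integer := 0;    -- only here to show how many rows were visited
BEGIN
   FOR r IN
      SELECT g AS id FROM generate_series(1, 5) g   -- placeholder query
   LOOP
      n := n + 1;     -- per-row work would go here
   END LOOP;          -- the implicit cursor is closed automatically

   RAISE NOTICE 'processed % rows', n;
END
$do$;
```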

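    But that is still far from perfect. Consider a completely new approach with set-based operations instead of inserting rows one by one in a loop. Cleaner and faster still. That is how a modern RDBMS works best.

    One SQL statement with data-modifying CTEs

    Wrapped into an SQL function to serve as a drop-in replacement.
    Data-modifying CTEs require Postgres 9.1 or later.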

    CREATE OR REPLACE FUNCTION ccdb_dummy.o_payments2(integer)
      RETURNS void
      LANGUAGE sql AS
    $func$
       WITH ins1 AS (
          INSERT INTO ccdb_dummy.payments(
                 payment_id,                        cin,  receipt_id, ...  ,   pm_amount, ref_transaction_id,creation_dt,created_by)   
          SELECT nextval('ccdb_stg.payments_seq'),c.cin,p.receipt_id, ...  , p.pm_amount, null,              now(),      'system'
          FROM   ccdb_stg.o_payments_stg   p
          LEFT   JOIN ccdb_dummy.consumers c USING (consumer_num)
          WHERE  p.section_code = $1
          RETURNING payment_id, receipt_id
          )
       INSERT INTO ccdb_dummy.payment_head_dtls(
                payment_id,  mbc_receipt_id,  charge_head_code,  amount,  tariff_id,creation_dt,created_by)
       SELECT i.payment_id,h.mbc_receipt_id,h.charge_head_code,h.amount,h.tariff_id,now(),      'system'
       FROM   ins1 i
       JOIN   ccdb_stg.o_payment_head_dtls_stg h ON h.mbc_receipt_id = i.receipt_id;
    $func$;
    

    This should do exactly the same as the plpgsql function above (barring translation errors). Only simpler and faster.
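
    One way to check that claim on a copy of the data is to compare row counts after a run. The sketch below assumes that the target tables were empty beforehand and that section code 1 exists; both are assumptions for illustration. Note that the LEFT JOIN would produce extra rows if consumer_num were not unique in ccdb_dummy.consumers, so a mismatch in the first pair of counts would point there.

```sql
BEGIN;

SELECT ccdb_dummy.o_payments2(1);   -- 1 is a hypothetical section code

-- Staged vs. migrated payments (assumes ccdb_dummy.payments started empty)
SELECT (SELECT count(*) FROM ccdb_stg.o_payments_stg WHERE section_code = 1) AS staged_payments
     , (SELECT count(*) FROM ccdb_dummy.payments)                            AS migrated_payments;

-- Staged vs. migrated detail rows (assumes ccdb_dummy.payment_head_dtls started empty)
SELECT (SELECT count(*)
        FROM   ccdb_stg.o_payments_stg p
        JOIN   ccdb_stg.o_payment_head_dtls_stg h ON h.mbc_receipt_id = p.receipt_id
        WHERE  p.section_code = 1)                        AS staged_details
     , (SELECT count(*) FROM ccdb_dummy.payment_head_dtls) AS migrated_details;

ROLLBACK;   -- undoes the inserts; values drawn from the sequence are not given back
```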

    Find more examples for INSERTs using data-modifying CTEs here on SO.
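
    For readers who want to try the technique in isolation, here is a small self-contained sketch. All table and column names (src_parent, src_child, dst_parent, dst_child) are made up for illustration and have nothing to do with the schema in the question. The first INSERT hands its generated keys to the second INSERT through RETURNING, which is the same pattern as in o_payments2.

```sql
-- Hypothetical staging and target tables
CREATE TEMP TABLE src_parent (receipt_id int PRIMARY KEY, amount numeric);
CREATE TEMP TABLE src_child  (receipt_id int, item text);
CREATE TEMP TABLE dst_parent (parent_id serial PRIMARY KEY, receipt_id int, amount numeric);
CREATE TEMP TABLE dst_child  (parent_id int, item text);

INSERT INTO src_parent VALUES (10, 100), (20, 250);
INSERT INTO src_child  VALUES (10, 'a'), (10, 'b'), (20, 'c');

-- One statement: insert the parents, then use their new keys for the children
WITH ins AS (
   INSERT INTO dst_parent (receipt_id, amount)
   SELECT receipt_id, amount
   FROM   src_parent
   RETURNING parent_id, receipt_id
   )
INSERT INTO dst_child (parent_id, item)
SELECT i.parent_id, c.item
FROM   ins i
JOIN   src_child c USING (receipt_id);

-- Each child row now points at the generated key of its parent
SELECT p.receipt_id, c.item
FROM   dst_parent p
JOIN   dst_child  c USING (parent_id)
ORDER  BY p.receipt_id, c.item;
```

    The final query returns three rows: (10, a), (10, b) and (20, c).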

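    【Comments】:

    • Thanks a lot... it really helped me a lot... it reduced my run time by almost 30%.
    • @YousufSultan: Compared to your original code or compared to my simplified version? Because compared to your original code I would expect more improvement.
    • My original code took around 4.5 hours, and when I tried your simplified code it took around 3 hours.
    • @YousufSultan: That seems way too slow for 125k - 335k rows. I would revisit your server configuration. For starters: the Postgres Wiki and the tag info for postgres-performance.
    • When I used the CTE, it reduced the time drastically. I was able to get my result in less than a minute. Thanks a lot mate... :)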