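Background: PDI (Kettle) was used to import a large volume of log-analysis data into a MySQL database. The import initially ran at only about 300+ rows per second (r/s).

Setting the following JDBC connection parameters noticeably increased the write speed.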

useServerPrepStmts=false

rewriteBatchedStatements=true

useCompression=true
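For readers who configure the connection through a raw JDBC URL rather than through PDI's Database Connection options, the three parameters above would end up as query parameters on the URL. The following is only a minimal sketch of that URL shape, written in Python for brevity; the host, port, and database name (`localhost`, `3306`, `logs`) and the helper name `build_mysql_jdbc_url` are hypothetical placeholders, not values from the original setup.

```python
from urllib.parse import urlencode


def build_mysql_jdbc_url(host, port, database, options):
    """Return a MySQL JDBC URL with the options appended as a query string."""
    query = urlencode(options)
    return f"jdbc:mysql://{host}:{port}/{database}?{query}"


# The three options from this post. Values are strings, as they appear in the URL.
BATCH_INSERT_OPTIONS = {
    "useServerPrepStmts": "false",
    "rewriteBatchedStatements": "true",
    "useCompression": "true",
}

# Host, port, and database are placeholders (assumptions for illustration only).
url = build_mysql_jdbc_url("localhost", 3306, "logs", BATCH_INSERT_OPTIONS)
print(url)
```

In PDI itself the same names and values would be entered as options on the Database Connection instead of being typed into a URL by hand.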


Speeding up data inserts in PDI (Kettle)


For how this works, see: http://forums.pentaho.com/showthread.php?142217-Table-Output-Performance-MySQL#9


To remedy this, in PDI I create a separate, specialized Database Connection I use for batch inserts. Set these two MySQL-specific options on your Database Connection:

useServerPrepStmts false
rewriteBatchedStatements true

Used together, these "fake" batch inserts on the client. Specifically, the insert statements:

INSERT INTO t (c1,c2) VALUES ('One',1);
INSERT INTO t (c1,c2) VALUES ('Two',2);
INSERT INTO t (c1,c2) VALUES ('Three',3);

will be rewritten into:

INSERT INTO t (c1,c2) VALUES ('One',1),('Two',2),('Three',3);

So that the batched rows will be inserted with one statement (and one network round-trip). With this simple change, Table Output is very fast and close to performance of the bulk loader steps.
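To make the rewrite described above concrete, here is a toy illustration in Python of how several single-row inserts collapse into one multi-row statement. It is not the MySQL driver's actual implementation; the helper names `sql_literal` and `rewrite_batch` are hypothetical, and the quoting is deliberately simplistic (strings, numbers, and NULL only).

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (toy quoting, illustration only)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape single quotes by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


def rewrite_batch(table, columns, rows):
    """Combine many single-row inserts into one multi-row INSERT statement."""
    column_list = ",".join(columns)
    value_groups = ",".join(
        "(" + ",".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({column_list}) VALUES {value_groups};"


# The same three rows as in the example above.
rows = [("One", 1), ("Two", 2), ("Three", 3)]
statement = rewrite_batch("t", ["c1", "c2"], rows)
print(statement)
# prints: INSERT INTO t (c1,c2) VALUES ('One',1),('Two',2),('Three',3);
```

The point of the sketch is only that the whole batch travels to the server as one statement, which is where the saving in network round-trips comes from.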

Reposted from: https://blog.51cto.com/fuqiang82/1628093
