ssslinppp

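Overview

Search online for how to batch-update rows in a MySQL table and you will usually find four or five approaches. The one I use is:
INSERT ... ON DUPLICATE KEY UPDATE Syntax
See the official MySQL documentation: insert-on-duplicate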

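What it does:

  • If an insert would duplicate an existing PRIMARY KEY or UNIQUE index value, MySQL updates the existing row instead of inserting a new one;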

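Notes:

  • This form of batch update is not standard SQL; it is specific to MySQL;
  • The update is triggered only by a collision on a UNIQUE index or the PRIMARY KEY;
  • With an auto-increment primary key that is omitted from the INSERT, MySQL generates a fresh id for every row, so the primary key never collides and rows are only inserted; an update happens only through another unique index (or when an existing id is supplied explicitly);
  • For batch updates, the VALUES() function is very useful;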

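Performance:

  • On tables with little data it is fast;
  • On tables with a lot of data it performs poorly, and another approach is worth considering;

With the InnoDB engine, the following approach is an option (InnoDB supports transactions):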

START TRANSACTION;
UPDATE ...
UPDATE ...
UPDATE ...
UPDATE ...
COMMIT;

https://dba.stackexchange.com/questions/28282/whats-the-most-efficient-way-to-batch-update-queries-in-mysql

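About VALUES(col_name)

VALUES(col_name) returns the value of column col_name that would be inserted if no duplicate-key conflict occurred. Note the wording: the value that would be inserted, not the value currently stored in the row.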
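
As a minimal sketch of that distinction, the statements below use a hypothetical table demo_capacity (it is not part of this post's data) whose unique key mirrors idx_templete_all. In the UPDATE clause, a bare column name refers to the value already stored in the row, while VALUES(col_name) refers to the value from the incoming row.

```sql
-- Hypothetical demo table, for illustration only.
CREATE TABLE demo_capacity (
  id INT NOT NULL AUTO_INCREMENT,
  pool_id CHAR(36) NOT NULL,
  templete_id VARCHAR(64) NOT NULL,
  host_used INT NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY uk_pool_templete (pool_id, templete_id)
) ENGINE=InnoDB;

-- No conflict yet: a plain insert.
INSERT INTO demo_capacity (pool_id, templete_id, host_used)
VALUES ('p1', 't001', 10);

-- Same (pool_id, templete_id): the existing row is updated instead.
-- host_used is the stored value (10); VALUES(host_used) is the incoming value (5).
INSERT INTO demo_capacity (pool_id, templete_id, host_used)
VALUES ('p1', 't001', 5)
ON DUPLICATE KEY UPDATE host_used = host_used + VALUES(host_used);

SELECT pool_id, templete_id, host_used FROM demo_capacity;  -- one row, host_used = 15

-- MySQL 8.0.19 and later also accept a row alias; using VALUES() this way
-- is deprecated from 8.0.20. The alias name "new" is an arbitrary choice.
INSERT INTO demo_capacity (pool_id, templete_id, host_used)
VALUES ('p1', 't001', 5) AS new
ON DUPLICATE KEY UPDATE host_used = demo_capacity.host_used + new.host_used;
```

On servers older than 8.0.19 the last statement is a syntax error, so VALUES() remains the form to use there.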


Original table structure and data

CREATE TABLE `capacity_pm` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment primary key',
  `pool_id` char(36) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Resource pool ID',
  `cluster_lv1` varchar(255) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Cluster category',
  `cluster_lv2` varchar(255) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Cluster sub-category',
  `update_at` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'Update or creation time',
  `templete_id` varchar(255) CHARACTER SET utf8 NOT NULL COMMENT 'Template ID',
  `templete_name` varchar(255) CHARACTER SET utf8 DEFAULT NULL COMMENT 'Template name',
  `templete_cpu_core` int(10) unsigned zerofill NOT NULL COMMENT 'Template CPU core count',
  `templete_mem_size` double NOT NULL COMMENT 'Template memory size',
  `templete_disk_size` double NOT NULL COMMENT 'Template disk size',
  `host_total` int(11) unsigned zerofill DEFAULT NULL COMMENT 'Total number of hosts',
  `host_used` int(11) unsigned zerofill DEFAULT NULL COMMENT 'Number of allocated hosts',
  `cpu_core_total` int(11) unsigned zerofill DEFAULT NULL COMMENT 'Total CPU cores',
  `cpu_core_free` int(11) DEFAULT NULL,
  `cpu_core_used` int(11) DEFAULT NULL COMMENT 'Allocated CPU cores',
  `cpu_core_util` double DEFAULT NULL COMMENT 'CPU core utilization ratio',
  `mem_total` double DEFAULT NULL COMMENT 'Total memory',
  `mem_free` double DEFAULT NULL,
  `mem_used` double DEFAULT NULL,
  `mem_util` double DEFAULT NULL COMMENT 'Memory utilization ratio',
  `disk_total` double DEFAULT NULL,
  `disk_free` double DEFAULT NULL,
  `disk_used` double DEFAULT NULL,
  `disk_util` double DEFAULT NULL COMMENT 'Disk utilization ratio',
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_templete_all` (`pool_id`,`templete_id`) USING BTREE COMMENT 'Full index on the template ID'
) ENGINE=InnoDB AUTO_INCREMENT=70 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;


INSERT INTO `capacity_pm` VALUES ('1', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't001', 'Database server', '0000000000', '0', '0', '00000000100', '00000000010', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('2', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't002', 'Performance server', '0000000000', '0', '0', '00000000200', '00000000020', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('3', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't003', 'Compute server', '0000000000', '0', '0', '00000000300', '00000000030', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('4', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't004', 'Storage server', '0000000000', '0', '0', '00000000400', '00000000040', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('5', '7b8f0f5e2fbb4d9aa2d5fd55466d638f', null, null, '2018-04-11 15:04:31', 't005', 'Network server', '0000000000', '0', '0', '00000000500', '00000000050', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('6', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't001', 'Database server', '0000000000', '0', '0', '00000001000', '00000000100', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('7', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't002', 'Performance server', '0000000000', '0', '0', '00000002000', '00000000200', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('8', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't003', 'Compute server', '0000000000', '0', '0', '00000003000', '00000000300', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('9', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't004', 'Storage server', '0000000000', '0', '0', '00000004000', '00000000400', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('10', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 15:04:31', 't005', 'Network server', '0000000000', '0', '0', '00000005000', '00000000500', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('12', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 08:12:00', 't006', 'Custom server', '0000000000', '0', '0', '00000006000', '00000000600', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('13', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 08:12:36', 't007', 'xxx server', '0000000000', '0', '0', '00000007000', '00000000700', null, null, null, null, null, null, null, null, null, null, null, null);
INSERT INTO `capacity_pm` VALUES ('14', '7b8f0f5e2fbb4d9aa2d5fd55466d638e', null, null, '2018-04-11 08:12:36', 't00x', 'server xxx', '0000000000', '0', '0', '00000008000', '00000000800', null, null, null, null, null, null, null, null, null, null, null, null);

A query over a subset of the columns (these are the columns the test focuses on):

mysql> SELECT pool_id, templete_id, host_total, host_used from capacity_pm ;
+----------------------------------+-------------+------------+-----------+
| pool_id                          | templete_id | host_total | host_used |
+----------------------------------+-------------+------------+-----------+
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t001        |        100 |        10 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t002        |        200 |        20 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t003        |        300 |        30 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t004        |        400 |        40 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t005        |        500 |        50 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t001        |       1000 |       100 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t002        |       2000 |       200 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t003        |       3000 |       300 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t004        |       4000 |       400 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t005        |       5000 |       500 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t006        |       6000 |       600 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t007        |       7000 |       700 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t00x        |       8000 |       800 |
+----------------------------------+-------------+------------+-----------+

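Test: inserting new data

  • Auto-increment primary key: the primary key of the table above is auto-increment, and the statement below does not supply id, so the primary key never collides and cannot trigger the update on its own (each row would simply get a new id and be inserted);
  • Unique index: UNIQUE KEY idx_templete_all (pool_id,templete_id)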
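
A nuance on the auto-increment point, sketched on a hypothetical table demo_pk (not from this post) and assuming the default auto-increment settings: when id is omitted, MySQL generates a fresh value, so the primary key alone never triggers the update. If an existing id is supplied explicitly, the primary key does collide and the row is updated.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE demo_pk (
  id INT NOT NULL AUTO_INCREMENT,
  host_used INT NOT NULL DEFAULT 0,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- id omitted: each statement gets a new id, so both rows are inserted.
INSERT INTO demo_pk (host_used) VALUES (10)
ON DUPLICATE KEY UPDATE host_used = VALUES(host_used);
INSERT INTO demo_pk (host_used) VALUES (20)
ON DUPLICATE KEY UPDATE host_used = VALUES(host_used);

-- id supplied and already present: the primary key collides, so row 1 is updated.
INSERT INTO demo_pk (id, host_used) VALUES (1, 15)
ON DUPLICATE KEY UPDATE host_used = VALUES(host_used);

SELECT id, host_used FROM demo_pk ORDER BY id;  -- rows (1, 15) and (2, 20)
```

This is why the test in this post relies on the unique index rather than on id.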

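The rows to be inserted duplicate the unique index values of rows already in the table, so MySQL performs an UPDATE rather than an INSERT.

The insert statement (every host_used value in it differs from the stored one):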

INSERT INTO `capacity_pm`(pool_id, templete_id, host_total, host_used) values
('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't001','100', '15'),
('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't002','200', '25'),
('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't003','300', '35'),
('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't004','400', '45'),
('7b8f0f5e2fbb4d9aa2d5fd55466d638f', 't005','500', '55')
ON DUPLICATE KEY UPDATE host_total=VALUES(host_total), host_used=VALUES(host_used);

ON DUPLICATE KEY UPDATE host_total=VALUES(host_total), host_used=VALUES(host_used)

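  • host_total=VALUES(host_total): VALUES(col_name) is the value of that column in the row being inserted;
  • host_used=VALUES(host_used): when several columns need updating, separate the assignments with ",";

Result: host_used changed for the five rows of pool 7b8f0f5e2fbb4d9aa2d5fd55466d638f; the other rows are untouched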

mysql> SELECT pool_id, templete_id, host_total, host_used from capacity_pm ;
+----------------------------------+-------------+------------+-----------+
| pool_id                          | templete_id | host_total | host_used |
+----------------------------------+-------------+------------+-----------+
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t001        |        100 |        15 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t002        |        200 |        25 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t003        |        300 |        35 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t004        |        400 |        45 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638f | t005        |        500 |        55 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t001        |       1000 |       100 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t002        |       2000 |       200 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t003        |       3000 |       300 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t004        |       4000 |       400 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t005        |       5000 |       500 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t006        |       6000 |       600 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t007        |       7000 |       700 |
| 7b8f0f5e2fbb4d9aa2d5fd55466d638e | t00x        |       8000 |       800 |
+----------------------------------+-------------+------------+-----------+
13 rows in set

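Other notes

Unique index: none of the columns in the unique index may be NULL, otherwise the update is never triggered. MySQL allows multiple NULL values in a UNIQUE index, so a row with a NULL key column is not treated as a duplicate and is inserted as a new row.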
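
A minimal sketch of the NULL case, using a hypothetical table demo_nullable (not part of this post's data) whose unique key has a nullable pool_id column, as capacity_pm does:

```sql
-- Hypothetical table, for illustration only; pool_id is nullable.
CREATE TABLE demo_nullable (
  id INT NOT NULL AUTO_INCREMENT,
  pool_id CHAR(36) DEFAULT NULL,
  templete_id VARCHAR(64) NOT NULL,
  host_used INT NOT NULL DEFAULT 0,
  PRIMARY KEY (id),
  UNIQUE KEY uk_pool_templete (pool_id, templete_id)
) ENGINE=InnoDB;

INSERT INTO demo_nullable (pool_id, templete_id, host_used)
VALUES (NULL, 't001', 10);

-- Looks like the same key, but NULL never equals NULL in a UNIQUE index,
-- so there is no conflict and this row is inserted rather than updated.
INSERT INTO demo_nullable (pool_id, templete_id, host_used)
VALUES (NULL, 't001', 15)
ON DUPLICATE KEY UPDATE host_used = VALUES(host_used);

SELECT pool_id, templete_id, host_used FROM demo_nullable;  -- two rows: 10 and 15
```

Declaring the key columns NOT NULL is one way to rule this out.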


Performance comparison

Batch-updating 50,000 rows

INSERT ... ON DUPLICATE KEY UPDATE Syntax

On my own machine this took about 30 s.

Batch update inside a transaction

START TRANSACTION;
UPDATE ...
UPDATE ...
UPDATE ...
UPDATE ...
COMMIT;

Result: this took a very long time; I did not determine the cause.
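
For reference, a concrete form of the transactional variant might look like the sketch below. These are not the statements used in the test above; the values come from the sample rows in this post, and the WHERE clauses name both columns of idx_templete_all on the assumption that each UPDATE should locate its row through that index. EXPLAIN is one way to check that assumption on a slow run.

```sql
START TRANSACTION;
-- One UPDATE per row, each filtered on the full unique key (pool_id, templete_id).
UPDATE capacity_pm SET host_used = 15
 WHERE pool_id = '7b8f0f5e2fbb4d9aa2d5fd55466d638f' AND templete_id = 't001';
UPDATE capacity_pm SET host_used = 25
 WHERE pool_id = '7b8f0f5e2fbb4d9aa2d5fd55466d638f' AND templete_id = 't002';
UPDATE capacity_pm SET host_used = 35
 WHERE pool_id = '7b8f0f5e2fbb4d9aa2d5fd55466d638f' AND templete_id = 't003';
COMMIT;

-- Optional check: does an UPDATE use idx_templete_all or scan the whole table?
EXPLAIN UPDATE capacity_pm SET host_used = 15
 WHERE pool_id = '7b8f0f5e2fbb4d9aa2d5fd55466d638f' AND templete_id = 't001';
```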
