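[Posted]: 2015-03-03 08:08:26
[Question]:
Situation: I created a unique composite index on a table. Building it took a while, but it did not remove the duplicate records, and it does not stop me from inserting duplicate rows.
Does anyone know what is going on here?
Table structure: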
> DESCRIBE translations;
+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| id          | int(11)      | NO   | PRI | NULL    | auto_increment |
| text        | text         | NO   |     | NULL    |                |
| language_id | int(11)      | NO   | MUL | NULL    |                |
| parent_id   | int(11)      | YES  | MUL | NULL    |                |
| type        | varchar(255) | YES  | MUL | NULL    |                |
| flag        | varchar(255) | YES  | MUL | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+
Index creation:
> ALTER IGNORE TABLE `translations`
ADD UNIQUE `unique_translations`
(`language_id`, `parent_id`, `type`, `flag`);
Query OK, 12225526 rows affected (4 min 51.91 sec)
Records: 12225526 Duplicates: 0 Warnings: 0
Index list:
> SHOW INDEXES FROM `translations`;
+--------------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name            | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| translations |          0 | PRIMARY             |            1 | id          | A         |    12178547 |     NULL | NULL   |      | BTREE      |         |               |
| translations |          0 | unique_translations |            1 | language_id | A         |        2712 |     NULL | NULL   |      | BTREE      |         |               |
| translations |          0 | unique_translations |            2 | parent_id   | A         |     2435709 |     NULL | NULL   | YES  | BTREE      |         |               |
| translations |          0 | unique_translations |            3 | type        | A         |     2435709 |     NULL | NULL   | YES  | BTREE      |         |               |
| translations |          0 | unique_translations |            4 | flag        | A         |     3044636 |     NULL | NULL   | YES  | BTREE      |         |               |
| translations |          1 | language_id_fk      |            1 | language_id | A         |          26 |     NULL | NULL   |      | BTREE      |         |               |
| translations |          1 | parent_id_fk        |            1 | parent_id   | A         |     1522318 |     NULL | NULL   | YES  | BTREE      |         |               |
| translations |          1 | flag                |            1 | flag        | A         |       10562 |     NULL | NULL   | YES  | BTREE      |         |               |
| translations |          1 | type                |            1 | type        | A         |       30370 |     NULL | NULL   | YES  | BTREE      |         |               |
+--------------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Verifying the problem:
> SELECT COUNT(`id`) FROM `translations`;
+-------------+
| COUNT(`id`) |
+-------------+
|    12225526 |
+-------------+
1 row in set (3.29 sec)
> SELECT * FROM `translations` ORDER BY `id` DESC LIMIT 1;
+----------+----------------+-------------+-----------+--------------+-------------+
| id       | text           | language_id | parent_id | type         | flag        |
+----------+----------------+-------------+-----------+--------------+-------------+
| 13754252 | text           |          50 |      NULL | text2        | text3       |
+----------+----------------+-------------+-----------+--------------+-------------+
1 row in set (0.01 sec)
> INSERT INTO `translations` VALUES (NULL, "text", 50, NULL, "text2", "text3");
Query OK, 1 row affected (0.00 sec)
> SELECT COUNT(`id`) FROM `translations`;
+-------------+
| COUNT(`id`) |
+-------------+
|    12225527 |
+-------------+
1 row in set (2.19 sec)
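One detail in the output above may be relevant: the row being re-inserted has `parent_id` set to NULL. MySQL and MariaDB allow multiple NULL values in a nullable column of a UNIQUE index, so two rows that agree on every indexed column but contain a NULL in one of them are not treated as duplicates. The sketch below reproduces that behaviour in isolation. The table `unique_null_demo` and its index `unique_demo` are hypothetical names used only for illustration, and the shorter VARCHAR length is an assumption chosen to keep the index key small.

```sql
-- Hypothetical scratch table; it mirrors the indexed columns of `translations`.
CREATE TABLE unique_null_demo (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  language_id INT NOT NULL,
  parent_id   INT NULL,
  type        VARCHAR(32) NULL,
  flag        VARCHAR(32) NULL,
  UNIQUE KEY unique_demo (language_id, parent_id, type, flag)
);

-- Both inserts are accepted: the NULL in parent_id means the two
-- key tuples are never considered equal by the unique index.
INSERT INTO unique_null_demo (language_id, parent_id, type, flag)
VALUES (50, NULL, 'text2', 'text3');
INSERT INTO unique_null_demo (language_id, parent_id, type, flag)
VALUES (50, NULL, 'text2', 'text3');

-- With no NULL in any indexed column, the first insert succeeds and
-- the second fails with a duplicate-key error (ER_DUP_ENTRY, 1062).
INSERT INTO unique_null_demo (language_id, parent_id, type, flag)
VALUES (50, 1, 'text2', 'text3');
INSERT INTO unique_null_demo (language_id, parent_id, type, flag)
VALUES (50, 1, 'text2', 'text3');
```

If the rows that look duplicated in `translations` all carry a NULL in `parent_id`, `type`, or `flag`, this would be consistent with both the `Duplicates: 0` result of the ALTER and the successful INSERT.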
Machine info:
root@precise64:~# uname -a
Linux precise64 3.13.0-43-generic #72-Ubuntu SMP Mon Dec 8 19:35:06 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@precise64:~# dpkg -l | grep -i mariadb
ii libmariadbclient18 10.0.14+maria-1~precise amd64 MariaDB database client library
ii mariadb-client-10.0 10.0.14+maria-1~precise amd64 MariaDB database client binaries
ii mariadb-client-core-10.0 10.0.14+maria-1~precise amd64 MariaDB database core client binaries
ii mariadb-common 10.0.14+maria-1~precise all MariaDB database common files (e.g. /etc/mysql/conf.d/mariadb.cnf)
ii mariadb-server 10.0.14+maria-1~precise all MariaDB database server (metapackage depending on the latest version)
ii mariadb-server-10.0 10.0.14+maria-1~precise amd64 MariaDB database server binaries
ii mariadb-server-core-10.0 10.0.14+maria-1~precise amd64 MariaDB database core server files
ii mysql-common 10.0.14+maria-1~precise all MariaDB database common files (e.g. /etc/mysql/my.cnf)
root@precise64:~# dpkg -l | grep -i mysql
ii libdbd-mysql-perl 4.025-1 amd64 Perl5 database interface to the MySQL database
ii libmysqlclient18 10.0.14+maria-1~precise amd64 Virtual package to satisfy external depends
ii mariadb-common 10.0.14+maria-1~precise all MariaDB database common files (e.g. /etc/mysql/conf.d/mariadb.cnf)
ii mysql-common 10.0.14+maria-1~precise all MariaDB database common files (e.g. /etc/mysql/my.cnf)
ii php5-mysql 5.5.9+dfsg-1ubuntu4.5 amd64 MySQL module for php5
rc php5-mysqlnd 5.3.10-1ubuntu3.11 amd64 MySQL module for php5 (Native Driver)
ii phpmyadmin 4:4.0.10-1 all MySQL web administration tool
root@precise64:~#
[Comments]:
-
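Do you still have a copy of the table from before you applied the unique index? If so, try
SELECT COUNT(*), language_id, parent_id, type, flag FROM original_translations GROUP BY language_id, parent_id, type, flag HAVING COUNT(*) > 1;
This will identify any duplicates that your unique index would prohibit.
-
@OllieJones That query returns 759 rows out of 12.234.849 in total, which looks suspicious to me, because I can see more duplicates than that when browsing the data by hand.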
-
@OllieJones Also, some of the returned rows are not duplicates; at least the flag column usually differs.
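The grouping query from the first comment reports one line per duplicate group, and GROUP BY puts NULL values into the same group. To inspect the individual rows behind those groups, one option is a self-join that compares the nullable columns with the NULL-safe operator `<=>`. This is only a sketch against the `translations` table from the question; the aliases `t1` and `t2`, the column alias `earlier_id`, and the LIMIT value are arbitrary choices, and on a table of this size the join may take a while.

```sql
-- List rows whose four indexed columns match those of a row with a
-- smaller id, treating NULL as equal to NULL via <=>.
SELECT t2.id,
       t1.id AS earlier_id,
       t2.language_id,
       t2.parent_id,
       t2.type,
       t2.flag
FROM translations AS t1
JOIN translations AS t2
  ON  t2.language_id =  t1.language_id  -- NOT NULL column, plain = is enough
  AND t2.parent_id  <=> t1.parent_id
  AND t2.type       <=> t1.type
  AND t2.flag       <=> t1.flag
  AND t2.id > t1.id
LIMIT 100;
```

Replacing `<=>` with `=` in the same query would show only the pairs that contain no NULL in the compared columns, which are the pairs a unique index would actually reject.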
Tags: mysql sql indexing mariadb