【Question title】: MySQL Table Index Cardinality
【Posted】: 2016-07-11 08:33:04
【Question】:

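I have a very large MySQL table with 2.5 million rows and growing. To speed up queries, I added an index to one of the columns. When I create the index manually, for example through phpMyAdmin, its cardinality is around 1500, which seems about right, and my queries run without any problem.

The problem appears after a few queries have run (especially INSERTs, but not only those): the cardinality of that index drops to 17 or 18 and the queries run extremely slowly. Sometimes it seems to work its way back up to around 1500 on its own; otherwise I have to fix it again through phpMyAdmin.

Is there any way to stop this drop in cardinality from happening?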

CREATE TABLE IF NOT EXISTS `probe_results` (
  `probe_result_id` int(11) NOT NULL AUTO_INCREMENT,
  `date` date NOT NULL,
  `month` int(11) NOT NULL,
  `year` int(11) NOT NULL,
  `time` time NOT NULL,
  `type` varchar(11) NOT NULL,
  `probe_id` varchar(50) NOT NULL,
  `status` varchar(11) NOT NULL,
  `temp_1` decimal(11,0) NOT NULL,
  `temp_2` decimal(11,0) NOT NULL,
  `crc` varchar(11) NOT NULL,
  `raw_data` text NOT NULL,
  `txt_file` text NOT NULL,
  PRIMARY KEY (`probe_result_id`),
  KEY `probe_id` (`probe_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 AUTO_INCREMENT=2527300 ;

The 'probe_result_id' column is the primary key, and probe_id is the column with the problem index.

Example query:

SELECT IF(b.reactive_total IS NULL, 0, b.reactive_total) AS reactive_total, a.* FROM (SELECT COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_testing_month = '7' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '0' THEN 1 END) AS due_total, (COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '1' AND asset_testing_results.asset_testing_satisfactory = '1' AND asset_testing_results.asset_testing_actioned = '0' THEN 1 END)+(IF(probes_passed_total IS NULL, 0, probes_passed_total))) AS passed_total, (COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '1' AND asset_testing_results.asset_testing_satisfactory = '0' AND asset_testing_results.asset_testing_actioned = '0' THEN 1 END)+(IF(probes_failed_total IS NULL, 0, probes_failed_total))) AS failed_total, COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_stopped = '0' AND asset_testing_results.asset_testing_completed = '1' AND asset_testing_results.asset_testing_actioned = '1' THEN 1 END) AS actioned_total, COUNT(CASE WHEN asset_testing_results.asset_testing_year = '2016' AND asset_testing_results.asset_testing_month < '7' AND asset_testing_results.asset_testing_completed = '0' AND asset_testing_results.asset_testing_satisfactory = '0' AND asset_testing_results.asset_stopped = '0' THEN 1 END) AS missed_total, site.site_key, site.site_name FROM site LEFT JOIN location ON location.site_key = site.site_key LEFT JOIN sub_location ON sub_location.location_key = location.location_key LEFT JOIN asset ON asset.sub_location_key = sub_location.sub_location_key AND asset.stopped = '0' LEFT JOIN asset_testing ON asset_testing.asset_type_key = asset.asset_type_key AND 
asset_testing.probe_assessed = '0' LEFT JOIN asset_testing_results ON asset_testing_results.asset_testing_key = asset_testing.asset_testing_key AND asset_testing_results.asset_key = asset.asset_key LEFT JOIN (SELECT site.site_key, COUNT(CASE WHEN p.probe_id IS NOT NULL AND p.asset_testing_key IS NOT NULL THEN 1 END) AS probes_passed_total, COUNT(CASE WHEN p.probe_id IS NOT NULL AND p.asset_testing_key IS NULL AND p.temp_1 IS NOT NULL THEN 1 END) AS probes_failed_total FROM assetsvs_probes LEFT JOIN (SELECT q.probe_id, q.month, q.year, IF(r.temp_1 IS NULL, q.temp_1, r.temp_1) as temp_1, r.asset_testing_key FROM (SELECT DISTINCT probe_results.probe_id, probe_results.month, probe_results.year, probe_results.temp_1 FROM probe_results LEFT JOIN assetsvs_probes ON assetsvs_probes.probe_id = probe_results.probe_id LEFT JOIN asset ON asset.asset_key = assetsvs_probes.asset_key LEFT JOIN sub_location ON sub_location.sub_location_key = asset.sub_location_key LEFT JOIN location ON location.location_key = sub_location.location_key LEFT JOIN site ON site.site_key = location.site_key WHERE site.client_key = '25')q LEFT JOIN (SELECT probe_results.month, probe_results.year, probe_results.probe_id, temp_1, asset_testing.asset_testing_key FROM probe_results LEFT JOIN assetsvs_probes ON assetsvs_probes.probe_id = probe_results.probe_id LEFT JOIN asset_testing ON asset_testing.asset_testing_key = assetsvs_probes.asset_testing_key LEFT JOIN asset ON asset.asset_key = assetsvs_probes.asset_key LEFT JOIN sub_location ON sub_location.sub_location_key = asset.sub_location_key LEFT JOIN location ON location.location_key = sub_location.location_key LEFT JOIN site ON site.site_key = location.site_key WHERE temp_1 != 'invalid' AND ((temp_1 >= test_min AND test_max = '') OR (temp_1 <= test_max AND test_min = '') OR (temp_1 >= test_min AND temp_1 <= test_max)) AND year = '2016' AND site.client_key = '25' GROUP BY probe_results.month, probe_results.year, probe_results.probe_id)r ON r.probe_id = 
q.probe_id AND r.month = q.month AND r.year = q.year WHERE q.year = '2016' GROUP BY probe_id, month, year) p ON p.probe_id = assetsvs_probes.probe_id LEFT JOIN asset_testing ON asset_testing.asset_testing_key = assetsvs_probes.asset_testing_key LEFT JOIN asset ON asset.asset_key = assetsvs_probes.asset_key LEFT JOIN sub_location ON sub_location.sub_location_key = asset.sub_location_key LEFT JOIN location ON location.location_key = sub_location.location_key LEFT JOIN site ON site.site_key = location.site_key GROUP BY site.site_key) probe_results ON probe_results.site_key = site.site_key WHERE site.client_key = '25' GROUP BY site.site_key)a LEFT JOIN (SELECT COUNT(CASE WHEN jobs.status = '3' THEN 1 END) AS reactive_total, site.site_key FROM jobs LEFT JOIN jobs_meta ON jobs_meta.job_id = jobs.job_id AND jobs_meta.meta_key = 'start_date' LEFT JOIN site ON site.site_key = jobs.site_key WHERE site.client_key = '25' AND jobs_meta.meta_value LIKE '%/2016 %' GROUP BY site.site_key)b ON b.site_key = a.site_key

Thanks

【Comments】:

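  • Also run EXPLAIN on your query and see what it tells you.
  • The table has 13 columns in total. There is a primary key on the unique ID column, and then this second index, which seems to be causing the problem.
  • OK, can you edit your question and post the CREATE TABLE <name> output listing the table's column attributes, and indicate where you want the index?
  • Question edited...
  • Purely as an aside: you realise your month and year values are 11 digits wide? That takes up quite a lot of storage space and probably won't be used. Same with your primary key column: it allows up to 11 digits, which is a lot for a list in the millions (7 digits) (x 1000s of table size).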

Tags: mysql sql indexing cardinality


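【Solution 1】:

Cardinality, like the other index statistics, is calculated and updated automatically by MySQL, so there is no direct way to prevent it from dropping.

However, you can take a few steps to make this less likely, or to correct it when it happens.

First, running the ANALYZE TABLE command makes MySQL update the index statistics, for every table engine that supports it.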
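
As a minimal sketch of that first step, using the probe_results table and probe_id index from the question: refresh the statistics, then read back the Cardinality column that SHOW INDEX reports. The EXPLAIN query and the probe ID value 'ABC123' are made up for illustration only.

```sql
-- Recalculate the index statistics for the table from the question.
ANALYZE TABLE probe_results;

-- Inspect the result: the Cardinality column holds the estimated
-- number of distinct values for each index.
SHOW INDEX FROM probe_results;

-- Check whether the optimizer now picks the probe_id index.
-- 'ABC123' is a hypothetical probe ID used only for illustration.
EXPLAIN SELECT probe_id, temp_1
FROM probe_results
WHERE probe_id = 'ABC123';
```

For InnoDB the figure is an estimate based on sampled index pages, so it can differ somewhat from one ANALYZE TABLE run to the next.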

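For the InnoDB table engine, MySQL provides a set of configuration settings that influence how the statistics are sampled. The settings and their effects are described in the MySQL documentation:

The main setting is innodb_stats_transient_sample_pages:

• Small values like 1 or 2 can result in inaccurate estimates of cardinality.

• Increasing the innodb_stats_transient_sample_pages value might require more disk reads. Values much larger than 8 (say, 100) can cause a significant slowdown in the time it takes to open a table or execute SHOW TABLE STATUS.

• The optimizer might choose very different query plans based on different estimates of index selectivity.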
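
A rough sketch of how these settings might be applied to the table from the question. It rests on several assumptions: the server version is not stated in the question, the value 64 is arbitrary rather than a recommendation, and the per-table STATS_* options only exist from MySQL 5.6.6 onward, where persistent statistics are enabled by default and innodb_stats_transient_sample_pages applies only to tables whose statistics are not persistent. Older servers name the variable innodb_stats_sample_pages.

```sql
-- See which statistics settings the server currently uses.
SHOW VARIABLES LIKE 'innodb_stats%';

-- Sample more pages per index when statistics are non-persistent.
-- 64 is an arbitrary illustrative value, not a recommendation.
SET GLOBAL innodb_stats_transient_sample_pages = 64;

-- MySQL 5.6.6+ only (assumption: the server is at least this version):
-- keep this table's statistics persistent, so they are not
-- re-estimated whenever the table is opened, and sample more pages.
ALTER TABLE probe_results
  STATS_PERSISTENT = 1,
  STATS_AUTO_RECALC = 1,
  STATS_SAMPLE_PAGES = 64;

-- Rebuild the statistics with the new settings.
ANALYZE TABLE probe_results;
```

With STATS_AUTO_RECALC = 0 instead, the persistent statistics would change only when ANALYZE TABLE is run explicitly, which trades freshness for stability.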

For MyISAM, MySQL does not provide such a variety of settings. The myisam_stats_method setting is described in the general index statistics documentation.
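
For completeness, a short sketch of inspecting and changing that setting. Note that it only controls how NULL values are grouped when statistics are collected, so it would not affect the probe_id column from the question, which is NOT NULL and lives in an InnoDB table; the value chosen here is purely illustrative.

```sql
-- Show how NULLs are currently treated when collecting MyISAM index statistics.
SHOW VARIABLES LIKE 'myisam_stats_method';

-- Valid values are nulls_equal (the default), nulls_unequal and nulls_ignored.
SET GLOBAL myisam_stats_method = 'nulls_unequal';
```

The new method is used the next time statistics are collected, for example by ANALYZE TABLE. InnoDB has an equivalent variable, innodb_stats_method, with the same set of values.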
