【Question title】: What is the best way to index a large MySQL table?
【Posted】: 2018-01-16 17:53:18
【Question description】:

Here is the table's CREATE statement:

CREATE TABLE `inodes_data` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `I_ID` int(11) unsigned NOT NULL DEFAULT '0',
  `Time` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `Stored` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `dataIndex` int(11) unsigned DEFAULT NULL,
  `memoryAddress` int(11) DEFAULT NULL,
  `Sens1` float DEFAULT NULL,
  `Sens2` float DEFAULT NULL,
  `Sens3` float DEFAULT NULL,
  `Sens4` float DEFAULT NULL,
  `rawData` char(50) NOT NULL,
  PRIMARY KEY (`id`,`Time`,`I_ID`),
  UNIQUE KEY `i_id_time_idx` (`I_ID`,`Time`),
  KEY `I_ID` (`I_ID`),
  KEY `IX_TIME` (`Time`)
) ENGINE=InnoDB AUTO_INCREMENT=8289060 DEFAULT CHARSET=latin1 AVG_ROW_LENGTH=83
/*!50100 PARTITION BY RANGE (TO_DAYS(Time))
(PARTITION p2011_Prior VALUES LESS THAN (0) ENGINE = InnoDB,
 PARTITION p2011_01 VALUES LESS THAN (734534) ENGINE = InnoDB,
 PARTITION p2011_02 VALUES LESS THAN (734562) ENGINE = InnoDB,
 PARTITION p2011_03 VALUES LESS THAN (734593) ENGINE = InnoDB,
 PARTITION p2011_04 VALUES LESS THAN (734623) ENGINE = InnoDB,
 PARTITION p2011_05 VALUES LESS THAN (734654) ENGINE = InnoDB,
 PARTITION p2011_06 VALUES LESS THAN (734684) ENGINE = InnoDB,
 PARTITION p2011_07 VALUES LESS THAN (734715) ENGINE = InnoDB,
 PARTITION p2011_08 VALUES LESS THAN (734746) ENGINE = InnoDB,
 PARTITION p2011_09 VALUES LESS THAN (734776) ENGINE = InnoDB,
 PARTITION p2011_10 VALUES LESS THAN (734807) ENGINE = InnoDB,
 PARTITION p2011_11 VALUES LESS THAN (734837) ENGINE = InnoDB,
 PARTITION p2011_12 VALUES LESS THAN (734868) ENGINE = InnoDB,
 PARTITION p2012_01 VALUES LESS THAN (734899) ENGINE = InnoDB,
 PARTITION p2012_02 VALUES LESS THAN (734928) ENGINE = InnoDB,
 PARTITION p2012_03 VALUES LESS THAN (734959) ENGINE = InnoDB,
 PARTITION p2012_04 VALUES LESS THAN (734989) ENGINE = InnoDB,
 PARTITION p2012_05 VALUES LESS THAN (735020) ENGINE = InnoDB,
 PARTITION p2012_06 VALUES LESS THAN (735050) ENGINE = InnoDB,
 PARTITION p2012_07 VALUES LESS THAN (735081) ENGINE = InnoDB,
 PARTITION p2012_08 VALUES LESS THAN (735112) ENGINE = InnoDB,
 PARTITION p2012_09 VALUES LESS THAN (735142) ENGINE = InnoDB,
 PARTITION p2012_10 VALUES LESS THAN (735173) ENGINE = InnoDB,
 PARTITION p2012_11 VALUES LESS THAN (735203) ENGINE = InnoDB,
 PARTITION p2012_12 VALUES LESS THAN (735234) ENGINE = InnoDB,
 PARTITION p2013_01 VALUES LESS THAN (735265) ENGINE = InnoDB,
 PARTITION p2013_02 VALUES LESS THAN (735293) ENGINE = InnoDB,
 PARTITION p2013_03 VALUES LESS THAN (735324) ENGINE = InnoDB,
 PARTITION p2013_04 VALUES LESS THAN (735354) ENGINE = InnoDB,
 PARTITION p2013_05 VALUES LESS THAN (735385) ENGINE = InnoDB,
 PARTITION p2013_06 VALUES LESS THAN (735415) ENGINE = InnoDB,
 PARTITION p2013_07 VALUES LESS THAN (735446) ENGINE = InnoDB,
 PARTITION p2013_08 VALUES LESS THAN (735477) ENGINE = InnoDB,
 PARTITION p2013_09 VALUES LESS THAN (735507) ENGINE = InnoDB,
 PARTITION p2013_10 VALUES LESS THAN (735538) ENGINE = InnoDB,
 PARTITION p2013_11 VALUES LESS THAN (735568) ENGINE = InnoDB,
 PARTITION p2013_12 VALUES LESS THAN (735599) ENGINE = InnoDB,
 PARTITION p2014_01 VALUES LESS THAN (735630) ENGINE = InnoDB,
 PARTITION p2014_02 VALUES LESS THAN (735658) ENGINE = InnoDB,
 PARTITION p2014_03 VALUES LESS THAN (735689) ENGINE = InnoDB,
 PARTITION p2014_04 VALUES LESS THAN (735719) ENGINE = InnoDB,
 PARTITION p2014_05 VALUES LESS THAN (735750) ENGINE = InnoDB,
 PARTITION p2014_06 VALUES LESS THAN (735780) ENGINE = InnoDB,
 PARTITION p2014_07 VALUES LESS THAN (735811) ENGINE = InnoDB,
 PARTITION p2014_08 VALUES LESS THAN (735842) ENGINE = InnoDB,
 PARTITION p2014_09 VALUES LESS THAN (735872) ENGINE = InnoDB,
 PARTITION p2014_10 VALUES LESS THAN (735903) ENGINE = InnoDB,
 PARTITION p2014_11 VALUES LESS THAN (735933) ENGINE = InnoDB,
 PARTITION p2014_12 VALUES LESS THAN (735964) ENGINE = InnoDB,
 PARTITION p2015_01 VALUES LESS THAN (735995) ENGINE = InnoDB,
 PARTITION p2015_02 VALUES LESS THAN (736023) ENGINE = InnoDB,
 PARTITION p2015_03 VALUES LESS THAN (736054) ENGINE = InnoDB,
 PARTITION p2015_04 VALUES LESS THAN (736084) ENGINE = InnoDB,
 PARTITION p2015_05 VALUES LESS THAN (736115) ENGINE = InnoDB,
 PARTITION p2015_06 VALUES LESS THAN (736145) ENGINE = InnoDB,
 PARTITION p2015_07 VALUES LESS THAN (736176) ENGINE = InnoDB,
 PARTITION p2015_08 VALUES LESS THAN (736207) ENGINE = InnoDB,
 PARTITION p2015_09 VALUES LESS THAN (736237) ENGINE = InnoDB,
 PARTITION p2015_10 VALUES LESS THAN (736268) ENGINE = InnoDB,
 PARTITION p2015_11 VALUES LESS THAN (736298) ENGINE = InnoDB,
 PARTITION p2015_12 VALUES LESS THAN (736329) ENGINE = InnoDB,
 PARTITION p2016_01 VALUES LESS THAN (736360) ENGINE = InnoDB,
 PARTITION p2016_02 VALUES LESS THAN (736389) ENGINE = InnoDB,
 PARTITION p2016_03 VALUES LESS THAN (736420) ENGINE = InnoDB,
 PARTITION p2016_04 VALUES LESS THAN (736450) ENGINE = InnoDB,
 PARTITION p2016_05 VALUES LESS THAN (736481) ENGINE = InnoDB,
 PARTITION p2016_06 VALUES LESS THAN (736511) ENGINE = InnoDB,
 PARTITION p2016_07 VALUES LESS THAN (736542) ENGINE = InnoDB,
 PARTITION p2016_08 VALUES LESS THAN (736573) ENGINE = InnoDB,
 PARTITION p2016_09 VALUES LESS THAN (736603) ENGINE = InnoDB,
 PARTITION p2016_10 VALUES LESS THAN (736634) ENGINE = InnoDB,
 PARTITION p2016_11 VALUES LESS THAN (736664) ENGINE = InnoDB,
 PARTITION p2016_12 VALUES LESS THAN (736695) ENGINE = InnoDB,
 PARTITION p2017_01 VALUES LESS THAN (736726) ENGINE = InnoDB,
 PARTITION p2017_02 VALUES LESS THAN (736754) ENGINE = InnoDB,
 PARTITION p2017_03 VALUES LESS THAN (736785) ENGINE = InnoDB,
 PARTITION p2017_04 VALUES LESS THAN (736815) ENGINE = InnoDB,
 PARTITION p2017_05 VALUES LESS THAN (736846) ENGINE = InnoDB,
 PARTITION p2017_06 VALUES LESS THAN (736876) ENGINE = InnoDB,
 PARTITION p2017_07 VALUES LESS THAN (736907) ENGINE = InnoDB,
 PARTITION p2017_08 VALUES LESS THAN (736938) ENGINE = InnoDB,
 PARTITION p2017_09 VALUES LESS THAN (736968) ENGINE = InnoDB,
 PARTITION p2017_10 VALUES LESS THAN (736999) ENGINE = InnoDB,
 PARTITION p2017_11 VALUES LESS THAN (737029) ENGINE = InnoDB,
 PARTITION p2017_12 VALUES LESS THAN (737060) ENGINE = InnoDB,
 PARTITION p2018_01 VALUES LESS THAN (737091) ENGINE = InnoDB,
 PARTITION p2018_02 VALUES LESS THAN (737119) ENGINE = InnoDB,
 PARTITION p2018_03 VALUES LESS THAN (737150) ENGINE = InnoDB,
 PARTITION p2018_04 VALUES LESS THAN (737180) ENGINE = InnoDB,
 PARTITION p2018_05 VALUES LESS THAN (737211) ENGINE = InnoDB,
 PARTITION p2018_06 VALUES LESS THAN (737241) ENGINE = InnoDB,
 PARTITION p2018_07 VALUES LESS THAN (737272) ENGINE = InnoDB,
 PARTITION p2018_08 VALUES LESS THAN (737303) ENGINE = InnoDB,
 PARTITION p2018_09 VALUES LESS THAN (737333) ENGINE = InnoDB,
 PARTITION p2018_10 VALUES LESS THAN (737364) ENGINE = InnoDB,
 PARTITION p2018_11 VALUES LESS THAN (737394) ENGINE = InnoDB,
 PARTITION p2018_12 VALUES LESS THAN (737425) ENGINE = InnoDB,
 PARTITION p2019_01 VALUES LESS THAN (737456) ENGINE = InnoDB,
 PARTITION p2019_02 VALUES LESS THAN (737484) ENGINE = InnoDB,
 PARTITION p2019_03 VALUES LESS THAN (737515) ENGINE = InnoDB,
 PARTITION p2019_04 VALUES LESS THAN (737545) ENGINE = InnoDB,
 PARTITION p2019_05 VALUES LESS THAN (737576) ENGINE = InnoDB,
 PARTITION p2019_06 VALUES LESS THAN (737606) ENGINE = InnoDB,
 PARTITION p2019_07 VALUES LESS THAN (737637) ENGINE = InnoDB,
 PARTITION p2019_08 VALUES LESS THAN (737668) ENGINE = InnoDB,
 PARTITION p2019_09 VALUES LESS THAN (737698) ENGINE = InnoDB,
 PARTITION p2019_10 VALUES LESS THAN (737729) ENGINE = InnoDB,
 PARTITION p2019_11 VALUES LESS THAN (737759) ENGINE = InnoDB,
 PARTITION p2019_12 VALUES LESS THAN (737790) ENGINE = InnoDB,
 PARTITION pUnknown VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;

The most common queries run against this table are:

SELECT  *
    FROM  inodes_data
    WHERE  I_ID = xxxxx
      AND  Time BETWEEN 'xxxx-xx-xx xx:xx:xx' AND 'xxxx-xx-xx xx:xx:xx';

SELECT  *
    FROM  inodes_data
    WHERE  I_ID IN (xxxxx,xxxx,....)
      AND  Time BETWEEN 'xxxx-xx-xx xx:xx:xx' AND 'xxxx-xx-xx xx:xx:xx';

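Currently, querying data from this table takes a long time. If I try to pull a year of data, it can take anywhere from 15 seconds to several minutes. I have researched this and am struggling to find a way to speed it up. Is there a better way to set up the indexes, and if so, can someone tell me why? Thanks for your help.

【Comments】:

  • BETWEEN needs AND, i.e. WHERE ... BETWEEN xxx AND yyy ??
  • For what it's worth, partitioned tables can be much slower than plain tables for WHERE clauses. The query planner sometimes has to search every partition separately. With fewer than 20 or 30 million rows in a table, partitioning usually does more harm than good, especially with recent MySQL versions and 21st-century server hardware. 83 bytes times 10 million rows is roughly 830 megabytes, under a gigabyte. Even multiplying by 10 for DBMS overhead, that is not much.
  • @BerndBuffen Sorry, I just left them out of the post. We do use them in our queries.

Tags: mysql indexing query-performance database-partitioning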


【Solution 1】:

I think MySQL will use your index on I_ID and Time. To be sure, you can put EXPLAIN at the start of the query to see what MySQL's query plan actually is. (https://dev.mysql.com/doc/refman/5.7/en/explain.html)

i.e.: EXPLAIN SELECT * FROM inodes_data WHERE I_ID = xxxxx AND Time BETWEEN 'xxxx-xx-xx xx:xx:xx' AND 'yyyy-yy-yy yy:yy:yy'; // added the AND clause

For more information: EXPLAIN EXTENDED SELECT * FROM inodes_data WHERE I_ID = xxxxx AND Time BETWEEN 'xxxx-xx-xx xx:xx:xx' AND 'yyyy-yy-yy yy:yy:yy'; // added the AND clause
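
As a concrete illustration, a filled-in EXPLAIN might look like the sketch below. The I_ID value 1234 and the February 2017 window are made-up placeholders, not values from the question. On MySQL 5.7 the output includes a partitions column by default; on 5.6 and earlier, EXPLAIN PARTITIONS provides it.

```sql
-- Sketch only: 1234 and the date window are hypothetical placeholders.
EXPLAIN
SELECT *
  FROM inodes_data
 WHERE I_ID = 1234
   AND Time >= '2017-02-01'
   AND Time <  '2017-03-01';

-- Columns worth reading in the result:
--   partitions : a short list means partition pruning is working
--   key        : which index was chosen (i_id_time_idx is the hoped-for one)
--   rows       : the optimizer's estimate of rows examined
```

If the partitions column lists nearly every partition, or key is NULL, the query is not benefiting from the existing layout.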

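Additionally, you could create a composite index on I_ID and Time, which may make a better index if you always query on both columns. You would alter your table to add another index. (Edit: I missed the existing UNIQUE key, which is already a multi-column index on these columns.)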

(https://dev.mysql.com/doc/refman/5.7/en/multiple-column-indexes.html)

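【Comments】:

  • Going to test the composite index and will let you know if I see an improvement. Wouldn't it be the same as what is already declared in the CREATE TABLE here: UNIQUE KEY i_id_time_idx (I_ID,Time),
  • A UNIQUE key is an index; don't add a redundant one. Nor a prefix of one: (I_ID, Time) takes care of (I_ID).
  • I don't know how I missed the UNIQUE index; it is certainly better than the INDEX I suggested. Either way, the EXPLAIN output should guide you to what is causing the slowness. Note that you should multiply together the rows values from each line of the output to estimate the total number of rows MySQL is working through; a large number in the rows column usually indicates a serious problem with the query.
【Solution 2】: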

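Are you really doing SELECT *? How many rows will that return? Isn't network time the real problem?

If you are really summarizing things, let's see the GROUP BY, etc. It can make a big difference in how to answer your question.

Meanwhile, ...

Get rid of the current PK and promote UNIQUE KEY (I_ID,Time) to be the PK. This will make the first query work optimally. In fact, it will work noticeably better without PARTITIONing.

Don't have more than about 50 partitions. The more partitions you have, the more overhead you pay.

Don't add partitions before they are needed. Again, "overhead". Keep a single "future" partition and use REORGANIZE PARTITION when adding a new one. More discussion here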
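
A minimal sketch of that pattern, assuming the catch-all partition pUnknown from the CREATE TABLE above plays the role of the "future" partition and that the pre-created future months have already been merged back into it. The new partition name p2020_01 and its boundary are illustrative only.

```sql
-- Sketch only: split the catch-all partition just before a new month starts.
-- p2020_01 and its boundary date are hypothetical examples.
ALTER TABLE inodes_data
    REORGANIZE PARTITION pUnknown INTO (
        PARTITION p2020_01 VALUES LESS THAN (TO_DAYS('2020-02-01')),
        PARTITION pUnknown VALUES LESS THAN MAXVALUE
    );
```

Because the catch-all partition is still empty when this runs, the reorganization has no rows to move and should be quick.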

Shrink the INTs where applicable (e.g., SMALLINT UNSIGNED).
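
One way to judge whether a smaller type is safe is to look at the range of values actually stored. The sketch below checks only I_ID, since it is indexed; the commented-out MODIFY is a hypothetical follow-up, not a recommendation for this specific data.

```sql
-- Sketch only: inspect the real value range before shrinking a column.
SELECT MIN(I_ID) AS min_i_id,
       MAX(I_ID) AS max_i_id
  FROM inodes_data;

-- Unsigned integer ranges in MySQL:
--   SMALLINT UNSIGNED  : 0 .. 65535       (2 bytes)
--   MEDIUMINT UNSIGNED : 0 .. 16777215    (3 bytes)
--   INT UNSIGNED       : 0 .. 4294967295  (4 bytes)

-- Hypothetical follow-up, only if max_i_id fits with room to grow:
-- ALTER TABLE inodes_data MODIFY I_ID SMALLINT UNSIGNED NOT NULL DEFAULT 0;
```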

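Remove the redundant KEY (I_ID). It wastes disk space and slows down INSERTs.

Get rid of the index on Time; an index that starts with the partition key is almost always inefficient.

Since the second query calls for a 2-dimensional index, I don't recommend removing partitioning. This is one of the few use cases for PARTITION.

Don't say CHAR(50) unless the text really is fixed length. (No, making the rows "fixed length" does not help.)

I suggest this construct for time ranges: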

 WHERE Time >= '2017-02-01'
   AND Time  < '2017-02-01' + INTERVAL 5 MONTH

Summary:

  • Clean up the partitions (fewer, no future ones, etc.)
  • Toss id
  • CHAR -> VARCHAR
  • Change the 4 indexes to 1: PRIMARY KEY(I_ID, Time)
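
The index and column changes in this summary could be expressed as a single ALTER, sketched below against the CREATE TABLE from the question. This is an assumption about how to apply the advice, not a tested migration: it rebuilds the whole table, so try it on a copy first, and confirm that nothing else depends on the id column.

```sql
-- Sketch only: rebuilds the entire table; test on a copy first.
ALTER TABLE inodes_data
    DROP PRIMARY KEY,
    DROP COLUMN id,
    DROP INDEX i_id_time_idx,   -- superseded by the new primary key
    DROP INDEX `I_ID`,          -- redundant prefix of (I_ID, Time)
    DROP INDEX IX_TIME,         -- index starting with the partition key
    ADD PRIMARY KEY (I_ID, Time),
    MODIFY rawData VARCHAR(50) NOT NULL;
```

The new primary key still contains Time, which MySQL requires because every unique key on a partitioned table must include the columns used in the partitioning expression.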

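【Comments】:

  • I'll give these suggestions the old college try and get back to you. Thanks so much for the reply. As for GROUP BY, this is time-series data, so grouping isn't really a thing here.
  • Not doing things like daily averages?
  • I suppose that might come up from time to time, but displaying the raw time series to the client is 99.9% of the workload. Really, I'm just lost in a pile of data and haven't been able to figure out how to deal with it.
  • This table has about 500M rows. Just doing a SELECT COUNT takes AGES.
  • A graph cannot usefully display more than a few hundred points. Yet you are fetching a million rows? My point is that there are ways to build the graph without fetching "too many" rows, which keeps the query fast. Please add a description of the goal to your question; I think I see a way to make the queries run 10 times faster. Part of the description needs to spell out how many days a query covers and how many rows that maps to.
【Solution 3】:

Build and maintain two "summary tables". Each table has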

Time (truncated to day for one table, hour for the other)
I_ID
3 columns for each sensor
miscellany

The columns for one sensor:

average for the day (or hour)
min
max

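Depending on the time range...

  • For ranges of less than a week, use the raw data, as you do now. (We may need to revisit the partitioning and indexing to make this work better.)
  • For 1 week to 6 months, use the hourly summary table.
  • For more than 6 months, use the daily summary table.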
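
For the middle case, a read from the hourly table might look like the sketch below. The table name inodes_hourly and the columns Sens1_avg, Sens1_min, and Sens1_max are hypothetical names for the summary table described above, and 1234 is a placeholder I_ID.

```sql
-- Sketch only: inodes_hourly and its Sens1_* columns are assumed names.
SELECT Time, Sens1_avg, Sens1_min, Sens1_max
  FROM inodes_hourly
 WHERE I_ID = 1234
   AND Time >= '2017-01-01'
   AND Time <  '2017-01-01' + INTERVAL 1 MONTH
 ORDER BY Time;
```

For one I_ID over a 31-day month this returns at most 24 × 31 = 744 rows, however many raw samples were recorded.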

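That way, you get at least 100 data points. Adjust the cutoffs to trade off query speed against detail in the graph.

If you like, you can draw a vertical bar instead of a point; this lets you show how much the value varied within the hour or day. If the user wants more detail, they can zoom in.

The summary tables can be non-partitioned, with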

PRIMARY KEY(I_ID, Time),
INDEX(Time)
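
Put together, an hourly summary table along these lines might be declared as in the sketch below. The table name, the column names, and the sample-count column n (standing in for "miscellany") are assumptions for illustration; only the sensor names Sens1 to Sens4 and the key layout come from the thread.

```sql
-- Sketch only: names are hypothetical; n is an assumed per-bucket sample count.
CREATE TABLE inodes_hourly (
    Time      DATETIME     NOT NULL,   -- truncated to the start of the hour
    I_ID      INT UNSIGNED NOT NULL,
    n         INT UNSIGNED NOT NULL,   -- raw rows summarized into this bucket
    Sens1_avg FLOAT, Sens1_min FLOAT, Sens1_max FLOAT,
    Sens2_avg FLOAT, Sens2_min FLOAT, Sens2_max FLOAT,
    Sens3_avg FLOAT, Sens3_min FLOAT, Sens3_max FLOAT,
    Sens4_avg FLOAT, Sens4_min FLOAT, Sens4_max FLOAT,
    PRIMARY KEY (I_ID, Time),
    INDEX (Time)
) ENGINE=InnoDB;
```

Storing the sample count makes it possible to compute a correctly weighted daily average from the hourly rows later.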

More on summary tables: http://mysql.rjweb.org/doc.php/summarytables

To maintain the tables, do the following: at the end of each hour, run a query like this:

 INSERT INTO Hourly (...)
     SELECT DATE_FORMAT(Time, '%Y-%m-%d %H:00:00') AS the_hour,  -- truncate the DATETIME to the hour
            I_ID, ...
            AVG(sensor1),
            MIN(sensor1),
            MAX(sensor1),
            ...
        FROM ...
        WHERE ...  -- just the one hour
        GROUP BY the_hour, I_ID;

At the end of each day, roll the hourly data up into the daily table.
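
A filled-in version of the hourly step might look like the sketch below. It assumes a summary table named inodes_hourly with columns (Time, I_ID, n, Sens1_avg, Sens1_min, Sens1_max), all hypothetical names, and it covers only Sens1; the other sensors would follow the same pattern. The hour shown is a placeholder that a scheduled job would substitute.

```sql
-- Sketch only: inodes_hourly and its columns are assumed names;
-- '2017-02-01 13:00:00' is a placeholder for the hour just completed.
INSERT INTO inodes_hourly
       (Time, I_ID, n, Sens1_avg, Sens1_min, Sens1_max)
SELECT DATE_FORMAT(Time, '%Y-%m-%d %H:00:00') AS the_hour,
       I_ID,
       COUNT(*),
       AVG(Sens1),
       MIN(Sens1),
       MAX(Sens1)
  FROM inodes_data
 WHERE Time >= '2017-02-01 13:00:00'
   AND Time <  '2017-02-01 13:00:00' + INTERVAL 1 HOUR
 GROUP BY the_hour, I_ID;
```

For the daily roll-up, averaging the hourly averages directly would weight sparse hours too heavily; with a stored count, SUM(Sens1_avg * n) / SUM(n) gives the true daily mean when Sens1 has no NULLs (otherwise a per-sensor count would be needed), while MIN and MAX can simply be taken over the hourly values.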

【Comments】:
