【发布时间】:2018-01-15 15:12:44
【问题描述】:
我是 SQL 新手,我有一个问题。
我有这个 SQL 代码:
DROP TABLE if exists students;
DROP TABLE if exists grades;
CREATE TABLE students(
s_id integer NOT NULL PRIMARY KEY,
s_name text,
s_last_name text,
curr_year integer
);
CREATE TABLE grades(
s_id integer NOT NULL PRIMARY KEY,
course text,
c_year integer,
grade integer,
FOREIGN KEY (s_id) REFERENCES students
);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (1, 'A', 'S', 3);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (2, 'A', 'A', 2);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (3, 'V', 'B', 1);
INSERT INTO students (s_id, s_name, s_last_name, curr_year)
VALUES (4, 'K', 'N', 2);
INSERT INTO grades (s_id, course, c_year, grade)
VALUES (1, 'DB', 2, 98);
INSERT INTO grades (s_id, course, c_year, grade)
VALUES (2, 'OS', 3, 90);
INSERT INTO grades (s_id, course, c_year, grade)
VALUES (3, 'DB', 2, 94);
EXPLAIN ANALYZE
SELECT *
FROM students s JOIN grades gr
ON s.s_id = gr.s_id
WHERE curr_year > 0;
CREATE INDEX student_details ON students(s_id, s_name, s_last_name);
CREATE INDEX student_courses ON grades(s_id, course);
EXPLAIN ANALYZE
SELECT *
FROM students s JOIN grades gr
ON s.s_id = gr.s_id
WHERE curr_year > 0;
DROP INDEX student_details;
DROP INDEX student_courses;
DROP TABLE students CASCADE;
DROP TABLE grades CASCADE;
我尝试理解解释输出。 在创建索引之前,我得到了哈希合并。这是解释输出:
Hash Join (cost=23.50..51.74 rows=270 width=116) (actual time=0.039..0.050 rows=3 loops=1)
Hash Cond: (gr.s_id = s.s_id)
-> Seq Scan on grades gr (cost=0.00..21.30 rows=1130 width=44) (actual time=0.005..0.008 rows=3 loops=1)
-> Hash (cost=20.12..20.12 rows=270 width=72) (actual time=0.021..0.021 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on students s (cost=0.00..20.12 rows=270 width=72) (actual time=0.006..0.011 rows=4 loops=1)
Filter: (curr_year > 0)
Planning time: 0.147 ms
Execution time: 0.089 ms
(9 rows)
在创建索引后,我得到了嵌套循环:
Nested Loop (cost=0.00..2.12 rows=1 width=116) (actual time=0.031..0.116 rows=3 loops=1)
Join Filter: (s.s_id = gr.s_id)
Rows Removed by Join Filter: 9
-> Seq Scan on students s (cost=0.00..1.05 rows=1 width=72) (actual time=0.012..0.018 rows=4 loops=1)
Filter: (curr_year > 0)
-> Seq Scan on grades gr (cost=0.00..1.03 rows=3 width=44) (actual time=0.003..0.009 rows=3 loops=4)
Planning time: 0.396 ms
Execution time: 0.170 ms
但我不知道为什么?为什么我的情况下的索引使嵌套循环比散列连接更受欢迎? 我会很高兴得到解释。
非常感谢!
【问题讨论】:
-
对于只有少量记录的表,优化根本没有意义。提示:计划者已经花费了比执行计划更多的时间来生成计划。
-
如果这是基于真实世界的示例,那么这些索引毫无意义。两个表中的
sid上已经有一个聚集索引,因此在索引的开头包含此列将使索引无用。
标签: sql postgresql query-performance