Original article:
https://rsbeoriginal.medium.com/partitioning-with-postgresql-v11-6fe5388c6e98
- What: Partitioning is splitting one table into multiple smaller tables.
- When: Partitioning is useful when a table is large and a few columns appear frequently in the `WHERE` clause of queries against it. Consider a `Book` table in a library management system: the inventory of books can be huge and keeps growing, but most queries on `Book` are about a book's status (borrowed/not borrowed, or available/not available). Since most queries filter on the `status` attribute, it makes sense to split the `Book` table on `status`. The column a table is split on, here `status`, is called the partition key.
- Why: The most obvious benefit of partitioning is query performance, provided we can identify a partition key that is used in most queries. Another benefit is more efficient use of storage. For example, if data is partitioned by time or usage, older or less-used data can be migrated to cheaper or slower storage media.
  Partitioning won't help much if the partitions are highly skewed.
- How: That's what the rest of this article is about!
PostgreSQL 11 offers three partitioning methods:
- Range Partition: used when we want to partition on ranges of an attribute's values. For example, employee data can be partitioned by age (20 to 30, 30 to 40, 40 to 50, and so on), or Medium could partition its articles by month of publication.
- List Partition: used when we want to partition on an explicit list of values. For example, the `status` attribute in the `Book` table can have 2 values (borrowed/not borrowed), so the table can have one partition per status.
- Hash Partition: used to spread data across partitions when we aren't sure that range or list partitioning would give a uniform distribution, but the data keeps growing and has to be distributed evenly. Each partition is defined by a modulus and a remainder. It works with any data type that has a hash operator class, which covers the common built-in types.

In this article, we'll be using PostgreSQL 11.
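To make the three methods concrete, here is a minimal sketch of the declarative syntax for each. The tables `employee`, `book` and `reader` and all their columns are hypothetical, chosen only to mirror the examples in the list above.

```sql
-- Hypothetical tables, for illustration only.

-- Range: each partition holds a range of ages (the upper bound is exclusive).
CREATE TABLE employee (
    id   bigint,
    name text,
    age  int NOT NULL
) PARTITION BY RANGE (age);
CREATE TABLE employee_20_30 PARTITION OF employee FOR VALUES FROM (20) TO (30);
CREATE TABLE employee_30_40 PARTITION OF employee FOR VALUES FROM (30) TO (40);

-- List: each partition holds an explicit list of values.
CREATE TABLE book (
    id     bigint,
    title  text,
    status text NOT NULL
) PARTITION BY LIST (status);
CREATE TABLE book_borrowed     PARTITION OF book FOR VALUES IN ('borrowed');
CREATE TABLE book_not_borrowed PARTITION OF book FOR VALUES IN ('not borrowed');

-- Hash: a row goes to the partition whose remainder matches
-- the hash of the key modulo the modulus.
CREATE TABLE reader (
    id   bigint NOT NULL,
    name text
) PARTITION BY HASH (id);
CREATE TABLE reader_p0 PARTITION OF reader FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE reader_p1 PARTITION OF reader FOR VALUES WITH (MODULUS 2, REMAINDER 1);
```

With range partitions, a row whose key falls outside every defined range is rejected unless a default partition exists, so in practice the ranges should cover all expected values.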
The examples use a table named `process` (the table name, not the verb).
We'll start by creating 2 tables:
- `process`: a normal table
- `process_partition`: a partitioned table with `status` as the partition key
Both tables will contain:
- `id`: auto-incremented id
- `name`
- `status`: possible values are `OPEN`, `IN_PROGRESS` and `DONE`
Let's create the `process` table first, with `id` as its primary key.
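The original statement did not survive intact, so the following is only a sketch of what the `process` table could look like. The column types (`bigserial`, `varchar(64)`, `varchar(16)`) are assumptions; only the column names and the primary key on `id` come from the article.

```sql
-- Sketch of the unpartitioned table; column types are assumed.
CREATE TABLE process (
    id     bigserial,      -- auto-incremented id
    name   varchar(64),
    status varchar(16),    -- OPEN, IN_PROGRESS or DONE
    PRIMARY KEY (id)
);
```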
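Next, we create the `process_partition` table, partitioned by list on `status`, with one partition per status: `process_partition_open` for the `OPEN` status, `process_partition_in_progress` for the `IN_PROGRESS` status, and `process_partition_done` for the `DONE` status.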
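A possible definition of the partitioned table and its three partitions is sketched below. The table and partition names are the ones used in this article; the column types are assumptions, as before.

```sql
-- Sketch of the partitioned table; column types are assumed.
CREATE TABLE process_partition (
    id     bigserial,
    name   varchar(64),
    status varchar(16) NOT NULL,   -- partition key
    PRIMARY KEY (id, status)       -- must include the partition key
) PARTITION BY LIST (status);

-- One partition per status value.
CREATE TABLE process_partition_open
    PARTITION OF process_partition FOR VALUES IN ('OPEN');
CREATE TABLE process_partition_in_progress
    PARTITION OF process_partition FOR VALUES IN ('IN_PROGRESS');
CREATE TABLE process_partition_done
    PARTITION OF process_partition FOR VALUES IN ('DONE');
```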
If we want to use `status` as the partition key, we are forced to include it in the primary key as well. Likewise, any unique constraint on the table must include the partition key. Otherwise, creating the table fails with an error.
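As an illustration of that restriction, the statement below is expected to fail on PostgreSQL 11, because its primary key leaves out the partition key. The table name `process_partition_bad` is hypothetical and the column types are assumed.

```sql
-- Expected to be rejected: PRIMARY KEY (id) does not include
-- the partition key column status.
CREATE TABLE process_partition_bad (
    id     bigserial,
    name   varchar(64),
    status varchar(16),
    PRIMARY KEY (id)
) PARTITION BY LIST (status);
```

Changing the constraint to `PRIMARY KEY (id, status)` makes the statement acceptable. The trade-off is that `id` alone is no longer guaranteed unique by the constraint.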
Now we'll add 10000 rows for each of the 3 statuses to the `process` table, using a PL/pgSQL `DO` block.
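The original block is lost apart from its closing line, so the following is one way such a `DO` block might be written, not the author's exact code. The generated `name` values are made up for illustration.

```sql
-- Insert 10000 rows per status into the unpartitioned table.
DO $$
DECLARE
    s text;
BEGIN
    FOREACH s IN ARRAY ARRAY['OPEN', 'IN_PROGRESS', 'DONE'] LOOP
        FOR i IN 1..10000 LOOP
            INSERT INTO process (name, status)
            VALUES ('process_' || s || '_' || i, s);  -- illustrative names
        END LOOP;
    END LOOP;
END; $$;
```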
Similarly, we'll add the same number of rows to the `process_partition` table.
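The article fills `process_partition` with a second `DO` block. As a shorter alternative (not the author's method), the rows can be copied from `process`, assuming that table has already been populated.

```sql
-- Copy name and status from process; id is generated by
-- process_partition's own sequence, starting at 1.
INSERT INTO process_partition (name, status)
SELECT name, status
FROM process
ORDER BY id;
```

Each inserted row is routed to the partition that matches its `status`, so no partition needs to be named in the `INSERT`.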
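Now let's check the row counts of `process_partition` and of its partitions `process_partition_open`, `process_partition_in_progress` and `process_partition_done`.
Based on the `status` column, each row is added to its respective partition, and the row count of the master table is the sum of the row counts of all its partitions.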
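The original screenshots of the counts are not available, but a check along these lines should show the distribution. It uses the system column `tableoid`, which identifies the partition a row is physically stored in.

```sql
-- Total rows seen through the master table.
SELECT count(*) FROM process_partition;

-- Rows per partition; the per-partition counts add up to the total above.
SELECT tableoid::regclass AS partition_name,
       count(*)           AS row_count
FROM process_partition
GROUP BY tableoid
ORDER BY tableoid;
```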
So now let’s insert a new row and then try to change status and observe the behaviour.
We insert a row with status `OPEN` into `process_partition`. The new row gets id 30001 and is added to `process_partition_open`.
Next, we update that row, which currently has status `OPEN`, to the `IN_PROGRESS` status. After the update, the row is in the `process_partition_in_progress` table. So we observe that a change to the partition key, here `status`, moves the row from one partition to another.
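The insert and update described above can be sketched as follows. The `name` value is made up, and the id 30001 assumes the table already holds exactly 30000 rows with ids 1 to 30000, as in the article.

```sql
-- Insert a new row; it is routed to process_partition_open.
INSERT INTO process_partition (name, status)
VALUES ('new_process', 'OPEN');

-- Check which partition holds the row.
SELECT tableoid::regclass AS partition_name, id, status
FROM process_partition
WHERE id = 30001;

-- Change the partition key; PostgreSQL 11 moves the row
-- to process_partition_in_progress.
UPDATE process_partition
SET status = 'IN_PROGRESS'
WHERE id = 30001 AND status = 'OPEN';

-- The same check now reports the new partition.
SELECT tableoid::regclass AS partition_name, id, status
FROM process_partition
WHERE id = 30001;
```

Row movement on an update of the partition key was introduced in PostgreSQL 11. In PostgreSQL 10 an update like this fails with an error instead.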
Below are the results for 30000 rows, and for 300000 rows after the data is increased 10x.
When the query filters on status `OPEN`, it scans only the `process_partition_open` table.
When the partition key is not in the `WHERE` clause, Postgres doesn't know which partition to scan, so it scans all the partitions. This case behaves like an unpartitioned table, because the query does not use the partition key.
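One way to see this difference is to compare query plans with `EXPLAIN`. The queries below are a sketch; the `name` value is hypothetical, and the exact plan output depends on the data and server settings, so none is shown here.

```sql
-- Filters on the partition key: the plan should reference
-- only process_partition_open.
EXPLAIN SELECT * FROM process_partition WHERE status = 'OPEN';

-- No partition key in the WHERE clause: the plan should reference
-- all three partitions.
EXPLAIN SELECT * FROM process_partition WHERE name = 'new_process';
```

Using `EXPLAIN ANALYZE` instead of `EXPLAIN` also runs the query and reports actual timings, which helps when comparing against the same query on the unpartitioned `process` table.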
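Skipping partitions that cannot contain matching rows is called partition pruning, and it is controlled by the `enable_partition_pruning` setting. We can check its current value with:
SHOW enable_partition_pruning;
With `enable_partition_pruning = on`, a query for status `OPEN` scans only the `process_partition_open` table, because it contains all rows whose status is `OPEN`.
We can turn off `enable_partition_pruning` using the statement below.
SET enable_partition_pruning = off;
With `enable_partition_pruning = off`, all partitions of the `process_partition` table are scanned.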
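The effect of the setting can be reproduced in a single session with a sequence like the one below. It is a sketch: the plans themselves are not shown because their details vary between installations.

```sql
-- Current value; on by default in PostgreSQL 11.
SHOW enable_partition_pruning;

-- With pruning on, only process_partition_open should appear in the plan.
EXPLAIN SELECT * FROM process_partition WHERE status = 'OPEN';

-- Turn pruning off for this session and compare the plan.
SET enable_partition_pruning = off;
EXPLAIN SELECT * FROM process_partition WHERE status = 'OPEN';

-- Restore the default so later queries are pruned again.
RESET enable_partition_pruning;
```

`SET` affects only the current session, so other connections keep pruning enabled.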
That's it, folks!
Sub-Partitioning and Attach/Detach partitions
References
https://www.postgresql.org/docs/11/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-BEST-PRACTICES