-
Notifications
You must be signed in to change notification settings - Fork 68
Partitioning by expression
As of release 1.4, pg_pathman supports partitioning by expression (both RANGE and HASH partitioning schemes).
/* we're going to store something like { "key": 100, "value": "abcdef...." } */
create table test(col jsonb not null);
/* create 10 partitions, each will contain 10.000 unique keys */
select create_range_partitions('test', '(col->>''key'')::int8', 1, 10000, 10);
/* insert some data */
insert into test
select format('{"key": %s, "date": "%s", "value": "%s"}',
i, current_date, md5(i::text))::jsonb
from generate_series(1, 10000 * 10) as g(i);
Now that we've created a table, let's look at the contents of pathman_config
:
select partrel, expr, parttype, range_interval from pathman_config;
partrel | expr | parttype | range_interval
---------+---------------------------------+----------+----------------
test | ((col ->> 'key'::text))::bigint | 2 | 10000
(1 row)
As you can see, the expr
column contains a textual representation of our partitioning expression. The expression's AST is also stored in pathman_config
, but it's not supposed to be human-readable:
select cooked_expr from pathman_config;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
cooked_expr | {COERCEVIAIO :arg {OPEXPR :opno 3477 :opfuncid 3214 :opresulttype 25 :opretset false :opcollid 100 :inputcollid 100 :args ({VAR :varno 1 :varattno 1 :vartype 3802 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 9} {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 15 :constvalue 7 [ 28 0 0 0 107 101 121 ]}) :location 12} :resulttype 20 :resultcollid 0 :coerceformat 1 :location 21}
Now let's take a glance at partitions:
table pathman_partition_list ;
parent | partition | parttype | expr | range_min | range_max
--------+-----------+----------+---------------------------------+-----------+-----------
test | test_1 | 2 | ((col ->> 'key'::text))::bigint | 1 | 10001
test | test_2 | 2 | ((col ->> 'key'::text))::bigint | 10001 | 20001
test | test_3 | 2 | ((col ->> 'key'::text))::bigint | 20001 | 30001
test | test_4 | 2 | ((col ->> 'key'::text))::bigint | 30001 | 40001
test | test_5 | 2 | ((col ->> 'key'::text))::bigint | 40001 | 50001
test | test_6 | 2 | ((col ->> 'key'::text))::bigint | 50001 | 60001
test | test_7 | 2 | ((col ->> 'key'::text))::bigint | 60001 | 70001
test | test_8 | 2 | ((col ->> 'key'::text))::bigint | 70001 | 80001
test | test_9 | 2 | ((col ->> 'key'::text))::bigint | 80001 | 90001
test | test_10 | 2 | ((col ->> 'key'::text))::bigint | 90001 | 100001
(10 rows)
Each partition has a tricky check constraint that stores partition's configuration (bounds / hash + number):
select conname, pg_get_constraintdef(oid) condef from pg_constraint where conrelid = 'test_1'::regclass;
-[ RECORD 1 ]---------------------------------------------------------------------------------------------------------------
conname | pathman_test_1_check
condef | CHECK (((((col ->> 'key'::text))::bigint >= '1'::bigint) AND (((col ->> 'key'::text))::bigint < '10001'::bigint)))
It's relatively easy to figure out whether pg_pathman is going to optimize a query... provided that expression consists of a single column:
create table abc(val int not null);
select create_hash_partitions('abc', 'val', 5); /* partition by 'val' */
explain select * from abc where val = 1;
QUERY PLAN
-------------------------------------------------------------
Append (cost=0.00..41.88 rows=13 width=4)
-> Seq Scan on abc_0 (cost=0.00..41.88 rows=13 width=4)
Filter: (val = 1)
(3 rows)
Things get much more complicated once you use an intricate composite expression (like the one we've introduced in previous sections). pg_pathman won't be able to optimize a query unless we reference the exact same expression we used to create partitions:
explain select * from test where ((col ->> 'key'::text))::int = 4;
QUERY PLAN
-----------------------------------------------------------------
Append (cost=0.00..6620.00 rows=500 width=32)
-> Seq Scan on test_1 (cost=0.00..662.00 rows=50 width=91)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_2 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_3 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_4 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_5 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_6 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_7 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_8 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_9 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
-> Seq Scan on test_10 (cost=0.00..662.00 rows=50 width=92)
Filter: (((col ->> 'key'::text))::integer = 4)
(21 rows)
Something is wrong, isn't int? Looks like we used the wrong expression:
select expr from pathman_config where partrel = 'test'::regclass;
expr
---------------------------------
((col ->> 'key'::text))::bigint /* we used ((col ->> 'key'::text))::integer */
(1 row)
Everything works just as expected once we change the expression:
explain select * from test where ((col ->> 'key'::text))::bigint = 4;
QUERY PLAN
----------------------------------------------------------------
Append (cost=0.00..662.00 rows=50 width=32)
-> Seq Scan on test_1 (cost=0.00..662.00 rows=50 width=91)
Filter: (((col ->> 'key'::text))::bigint = 4)
(3 rows)
There are several limitations:
- Expression must reference at least one column of partitioned table;
- All referenced columns must be marked NOT NULL;
- Expression may not reference system attributes (oid, xmin, xmax etc);
- Expression may not include sub-queries;
- All functions used by expression must be marked IMMUTABLE;