Skip to content

Partitioning by expression

Dmitry Ivanov edited this page Jun 6, 2017 · 3 revisions

As of release 1.4, pg_pathman supports partitioning by expression (both RANGE and HASH partitioning schemes).

Prerequisites

/* we're going to store something like { "key": 100, "value": "abcdef...." } */
create table test(col jsonb not null);

/* create 10 partitions, each will contain 10.000 unique keys */
select create_range_partitions('test', '(col->>''key'')::int8', 1, 10000, 10);

/* insert some data */
insert into test
select format('{"key": %s, "date": "%s", "value": "%s"}',
              i, current_date, md5(i::text))::jsonb
from generate_series(1, 10000 * 10) as g(i);

Expression's internals

Now that we've created a table, let's look at the contents of pathman_config:

select partrel, expr, parttype, range_interval from pathman_config;
 partrel |              expr               | parttype | range_interval
---------+---------------------------------+----------+----------------
 test    | ((col ->> 'key'::text))::bigint |        2 | 10000
(1 row)

As you can see, the expr column contains a textual representation of our partitioning expression. The expression's AST is also stored in pathman_config, but it's not supposed to be human-readable:

select cooked_expr from pathman_config;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
cooked_expr | {COERCEVIAIO :arg {OPEXPR :opno 3477 :opfuncid 3214 :opresulttype 25 :opretset false :opcollid 100 :inputcollid 100 :args ({VAR :varno 1 :varattno 1 :vartype 3802 :vartypmod -1 :varcollid 0 :varlevelsup 0 :varnoold 1 :varoattno 1 :location 9} {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 15 :constvalue 7 [ 28 0 0 0 107 101 121 ]}) :location 12} :resulttype 20 :resultcollid 0 :coerceformat 1 :location 21}

Partitions

Now let's take a glance at partitions:

table pathman_partition_list ;
 parent | partition | parttype |              expr               | range_min | range_max
--------+-----------+----------+---------------------------------+-----------+-----------
 test   | test_1    |        2 | ((col ->> 'key'::text))::bigint | 1         | 10001
 test   | test_2    |        2 | ((col ->> 'key'::text))::bigint | 10001     | 20001
 test   | test_3    |        2 | ((col ->> 'key'::text))::bigint | 20001     | 30001
 test   | test_4    |        2 | ((col ->> 'key'::text))::bigint | 30001     | 40001
 test   | test_5    |        2 | ((col ->> 'key'::text))::bigint | 40001     | 50001
 test   | test_6    |        2 | ((col ->> 'key'::text))::bigint | 50001     | 60001
 test   | test_7    |        2 | ((col ->> 'key'::text))::bigint | 60001     | 70001
 test   | test_8    |        2 | ((col ->> 'key'::text))::bigint | 70001     | 80001
 test   | test_9    |        2 | ((col ->> 'key'::text))::bigint | 80001     | 90001
 test   | test_10   |        2 | ((col ->> 'key'::text))::bigint | 90001     | 100001
(10 rows)

Each partition has a tricky check constraint that stores partition's configuration (bounds / hash + number):

select conname, pg_get_constraintdef(oid) condef from pg_constraint where conrelid = 'test_1'::regclass;
-[ RECORD 1 ]---------------------------------------------------------------------------------------------------------------
conname | pathman_test_1_check
condef  | CHECK (((((col ->> 'key'::text))::bigint >= '1'::bigint) AND (((col ->> 'key'::text))::bigint < '10001'::bigint)))

Sample queries

It's relatively easy to figure out whether pg_pathman is going to optimize a query... provided that expression consists of a single column:

create table abc(val int not null);
select create_hash_partitions('abc', 'val', 5); /* partition by 'val' */

explain select * from abc where val = 1;
                         QUERY PLAN
-------------------------------------------------------------
 Append  (cost=0.00..41.88 rows=13 width=4)
   ->  Seq Scan on abc_0  (cost=0.00..41.88 rows=13 width=4)
         Filter: (val = 1)
(3 rows)

Things get much more complicated once you use an intricate composite expression (like the one we've introduced in previous sections). pg_pathman won't be able to optimize a query unless we reference the exact same expression we used to create partitions:

explain select * from test where ((col ->> 'key'::text))::int = 4;
                           QUERY PLAN
-----------------------------------------------------------------
 Append  (cost=0.00..6620.00 rows=500 width=32)
   ->  Seq Scan on test_1  (cost=0.00..662.00 rows=50 width=91)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_2  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_3  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_4  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_5  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_6  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_7  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_8  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_9  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
   ->  Seq Scan on test_10  (cost=0.00..662.00 rows=50 width=92)
         Filter: (((col ->> 'key'::text))::integer = 4)
(21 rows)

Something is wrong, isn't int? Looks like we used the wrong expression:

select expr from pathman_config where partrel = 'test'::regclass;
              expr
---------------------------------
 ((col ->> 'key'::text))::bigint   /* we used ((col ->> 'key'::text))::integer */
(1 row)

Everything works just as expected once we change the expression:

explain select * from test where ((col ->> 'key'::text))::bigint = 4;
                           QUERY PLAN
----------------------------------------------------------------
 Append  (cost=0.00..662.00 rows=50 width=32)
   ->  Seq Scan on test_1  (cost=0.00..662.00 rows=50 width=91)
         Filter: (((col ->> 'key'::text))::bigint = 4)
(3 rows)

Limitations

There are several limitations:

  • Expression must reference at least one column of partitioned table;
  • All referenced columns must be marked NOT NULL;
  • Expression may not reference system attributes (oid, xmin, xmax etc);
  • Expression may not include sub-queries;
  • All functions used by expression must be marked IMMUTABLE;
Clone this wiki locally