pg_hint_plan -- controls execution plan with hinting phrases in comment of special form.
PostgreSQL uses cost based optimizer, which utilizes data statistics, not static rules. The planner (optimizer) esitimates costs of each possible execution plans for a SQL statement then the execution plan with the lowest cost finally be executed. The planner does its best to select the best best execution plan, but not perfect, since it doesn't count some properties of the data, for example, correlation between columns.
pg_hint_plan makes it possible to tweak execution plans using so-called "hints", which are simple descriptions in the SQL comment of special form.
pg_hint_plan reads hinting phrases in a comment of special form given with the target SQL statement. The special form is beginning by the character sequence "/*+" and ends with "*/". Hint phrases are consists of hint name and following parameters enclosed by parentheses and delimited by spaces. Each hinting phrases can be delimited by new lines for readability.
In the example below , hash join is selected as the joning method and scanning pgbench_accounts by sequential scan method.
postgres=# /*+ postgres*# HashJoin(a b) postgres*# SeqScan(a) postgres*# */ postgres-# EXPLAIN SELECT * postgres-# FROM pgbench_branches b postgres-# JOIN pgbench_accounts a ON b.bid = a.bid postgres-# ORDER BY a.aid; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=31465.84..31715.84 rows=100000 width=197) Sort Key: a.aid -> Hash Join (cost=1.02..4016.02 rows=100000 width=197) Hash Cond: (a.bid = b.bid) -> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=1.01..1.01 rows=1 width=100) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100) (7 rows) postgres=#
Hinting phrases are classified into four types based on what kind of object they can affect. Scaning methods, join methods, joining order, row number correction and GUC setting. You will see the lists of hint phrases of each type in Hint list.
Scan method hints enforce the scanning method on the table specified as parameter. pg_hint_plan recognizes the target table by alias names if any. They are 'SeqScan' , 'IndexScan' and so on.
Scan hints are effective on ordinary tables, inheritance tables, UNLOGGED tables, temporary tables and system catalogs. It cannot be applicable on external(foreign) tables, table functions, VALUES command results, CTEs, Views and Sub-enquiries.
Join method hints enforces the join methods of the joins consists of tables specified as parameters.
Ordinary tables, inheritance tables, UNLOGGED tables, temporary tables, external (foreign) tables, system catalogs, table functions, VALUES command results and CTEs are allowed to be in the parameter list. But views and sub query are not.
Joining in specific order can be enforced using the "Leading" hint. The objects are joined in the order of the objects in the parameter list.
From the restriction of the planner's capability, it misestimates the number of results on some conditions. This type of hint corrects it.
'Set' hint changes GUC parameters just while planning. GUC parameter shown in Query Planning can have the expected effects on planning unless any other hint conflicts with the planner method configuration parameters. The last one among hints on the same GUC parameter makes effect. GUC parameters for pg_hint_plan are also settable by this hint but it won't work as your expectation. See Restrictions for details.
GUC parameters below affect the behavior of pg_hint_planpg_hint_plan.
Parameter name | discription | Default |
---|---|---|
pg_hint_plan.enable_hint | Enbles or disables the function of pg_hint_plan. | on |
pg_hint_plan.enable_hint_table | Enbles or disables table-based hinting. | on |
pg_hint_plan.debug_print | Enables and select the verbosity of the debug output of pg_hint_plan. Valid options are off, on, detailed and verbose. | off |
pg_hint_plan.message_level | Specifies the message level of debug prints. error, warning, notice, info, log, debug | info |
Simply run "make" in the top of the source tree, then "make install" as appropriate user. The PATH environment variable should be set properly for the target PostgreSQL for this process.
$ tar xzvf pg_hint_plan-1.x.x.tar.gz $ cd pg_hint_plan-1.x.x $ make $ su # make install
Basically pg_hint_plan does not requires CREATE EXTENSION. Simply loading it by LOAD command will activate it and of course you can load it globally by setting shared_preload_libraries in postgresql.conf. Or you might be interested in ALTER USER SET/ALTER DATABASE SET for automatic loading for specific sessions.
postgres=# LOAD 'pg_hint_plan'; LOAD postgres=#
Do CREATE EXTENSION and SET pg_hint_plan.enable_hint_tables TO on if you are planning to hint tables.
"make uninstall" in the top directory of source tree will uninstall the installed files if you installed from the source tree and it is left available.
$ cd pg_hint_plan-1.x.x $ su # make uninstall
Scan hints have basically has one parameter to specify the target object. This additional parameter for scans using indexes is preferable index name. The target object should be specified by its alias name if any. In the following example, table1 is scanned by sequential scan and table2 is scanned using the primary key index.
postgres=# /*+ postgres*# SeqScan(t1) postgres*# IndexScan(t2 t2_pkey) postgres*# */ postgres-# SELECT * FROM table1 t1 JOIN table table2 t2 ON (t1.key = t2.key);
Join hints have two or more objects which compose the join as parameters. If three objects are specified, the hint will be applied when joining any one of them after joining other two objects. In the following example, table1 and table2 are joined fisrt using nested loop and the result is joined against table3 using merge join.
postgres=# /*+ postgres*# NestLoop(t1 t2) postgres*# MergeJoin(t1 t2 t3) postgres*# Leading(t1 t2 t3) postgres*# */ postgres-# SELECT * FROM table1 t1 postgres-# JOIN table table2 t2 ON (t1.key = t2.key) postgres-# JOIN table table3 t3 ON (t2.key = t3.key);
Although there might be the case that table2 and 3 are joined first and table1 after that and the NestLoop hint won't be in effect after all. "Leading" hint enforces the joining order for the cases. The Leading hint in the above example enforces the joining order to table1, 2, 3 then both join method hints will be effective.
The above form of Leading hint enforces joining order but joining direction (inner/outer or driven/driving assignment) is left to the planner. If you want to also enforce joining directions, the second form of this hint will help.
postgres=# /*+ Leading((t1 (t2 t3))) */ SELECT...
Every pair of parentheses enclose two elements which are an object or nested parentheses. The first element in a pair of parentheses is the driver or outer table and the second is the driven or inner.
Planner misestimates the number of the records for joins on some condition. This hint can corrects the number by several methods, which are absolute value, addition/subtraction and multiplication. The parameters are the list of objects compose the targetted join then operation. The following example shows notations to correct the number of the join on a and b by the four correction methods.
postgres=# /*+ Rows(a b #10) */ SELECT... ; Sets rows of join result to 10 postgres=# /*+ Rows(a b +10) */ SELECT... ; Increments row number by 10 postgres=# /*+ Rows(a b -10) */ SELECT... ; Subtracts 10 from the row number. postgres=# /*+ Rows(a b *10) */ SELECT... ; Makes the number 10 times larger.
"Set" hint sets GUC parameter values during the target statement is under plannning. In the following example, planning for the query is done with random_page_cost is 2.0.
postgres=# /*+ postgres*# Set(random_page_cost 2.0) postgres*# */ postgres-# SELECT * FROM table1 t1 WHERE key = 'value'; ...
postgres=# /*+ postgres*# HashJoin(a b) postgres*# SeqScan(a) postgres*# */ postgres-# /*+ IndexScan(a) */ postgres-# EXPLAIN SELECT /*+ MergeJoin(a b) */ * postgres-# FROM pgbench_branches b postgres-# JOIN pgbench_accounts a ON b.bid = a.bid postgres-# ORDER BY a.aid; QUERY PLAN --------------------------------------------------------------------------------------- Sort (cost=31465.84..31715.84 rows=100000 width=197) Sort Key: a.aid -> Hash Join (cost=1.02..4016.02 rows=100000 width=197) Hash Cond: (a.bid = b.bid) -> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=1.01..1.01 rows=1 width=100) -> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100) (7 rows) postgres=#
postgres=# CREATE FUNCTION hints_func(integer) RETURNS integer AS $$ postgres$# DECLARE postgres$# id integer; postgres$# cnt integer; postgres$# BEGIN postgres$# SELECT /*+ NoIndexScan(a) */ aid postgres$# INTO id FROM pgbench_accounts a WHERE aid = $1; postgres$# SELECT /*+ SeqScan(a) */ count(*) postgres$# INTO cnt FROM pgbench_accounts a; postgres$# RETURN id + cnt; postgres$# END; postgres$# $$ LANGUAGE plpgsql;
postgres=# /*+ HashJoin(t1 t1)*/ postgres-# EXPLAIN SELECT * FROM s1.t1 postgres-# JOIN public.t1 ON (s1.t1.id=public.t1.id); INFO: hint syntax error at or near "HashJoin(t1 t1)" DETAIL: Relation name "t1" is ambiguous. QUERY PLAN ------------------------------------------------------------------ Merge Join (cost=337.49..781.49 rows=28800 width=8) Merge Cond: (s1.t1.id = public.t1.id) -> Sort (cost=168.75..174.75 rows=2400 width=4) Sort Key: s1.t1.id -> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4) -> Sort (cost=168.75..174.75 rows=2400 width=4) Sort Key: public.t1.id -> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4) (8 行) postgres=# /*+ HashJoin(pt st) */ postgres-# EXPLAIN SELECT * FROM s1.t1 st postgres-# JOIN public.t1 pt ON (st.id=pt.id); QUERY PLAN --------------------------------------------------------------------- Hash Join (cost=64.00..1112.00 rows=28800 width=8) Hash Cond: (st.id = pt.id) -> Seq Scan on t1 st (cost=0.00..34.00 rows=2400 width=4) -> Hash (cost=34.00..34.00 rows=2400 width=4) -> Seq Scan on t1 pt (cost=0.00..34.00 rows=2400 width=4) (5 行)
postgres=# CREATE VIEW v1 AS SELECT * FROM t2; postgres=# EXPLAIN /*+ HashJoin(t1 v1) */ SELECT * FROM t1 JOIN v1 ON (c1.a = v1.a); QUERY PLAN ------------------------------------------------------------------ Hash Join (cost=3.27..18181.67 rows=101 width=8) Hash Cond: (t1.a = t2.a) -> Seq Scan on t1 (cost=0.00..14427.01 rows=1000101 width=4) -> Hash (cost=2.01..2.01 rows=101 width=4) -> Seq Scan on t2 (cost=0.00..2.01 rows=101 width=4)
postgres=# CREATE VIEW view1 AS SELECT * FROM table1 t1; CREATE TABLE postgres=# /*+ SeqScan(t1) */ postgres=# EXPLAIN SELECT * FROM table1 t1 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1; QUERY PLAN ----------------------------------------------------------------- Nested Loop (cost=0.00..358.01 rows=1 width=16) -> Seq Scan on table1 t1 (cost=0.00..179.00 rows=1 width=8) Filter: (key = 1) -> Seq Scan on table1 t1 (cost=0.00..179.00 rows=1 width=8) Filter: (key = 1) (5 rows) postgres=# /*+ SeqScan(t3) */ postgres=# EXPLAIN SELECT * FROM table1 t3 JOIN view1 t2 ON (t1.key = t2.key) WHERE t2.key = 1; QUERY PLAN -------------------------------------------------------------------------------- Nested Loop (cost=0.00..187.29 rows=1 width=16) -> Seq Scan on table1 t3 (cost=0.00..179.00 rows=1 width=8) Filter: (key = 1) -> Index Scan using foo_pkey on table1 t1 (cost=0.00..8.28 rows=1 width=8) Index Cond: (key = 1) (5 rows)
IN (SELECT ... {LIMIT | OFFSET ...} ...) = ANY (SELECT ... {LIMIT | OFFSET ...} ...) = SOME (SELECT ... {LIMIT | OFFSET ...} ...)
For these syntaxes, planner internally assigns the name of "ANY_subquery" to the subquery when planning joins including it, so join hints are applicable on such joins using the implicit name.
postgres=# /*+HashJoin(a1 ANY_subquery)*/ postgres=# EXPLAIN SELECT * postgres=# FROM pgbench_accounts a1 postgres=# WHERE aid IN (SELECT bid FROM pgbench_accounts a2 LIMIT 10); QUERY PLAN --------------------------------------------------------------------------------------------- Hash Semi Join (cost=0.49..2903.00 rows=1 width=97) Hash Cond: (a1.aid = a2.bid) -> Seq Scan on pgbench_accounts a1 (cost=0.00..2640.00 rows=100000 width=97) -> Hash (cost=0.36..0.36 rows=10 width=4) -> Limit (cost=0.00..0.26 rows=10 width=4) -> Seq Scan on pgbench_accounts a2 (cost=0.00..2640.00 rows=100000 width=4) (6 rows)