Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hint table doesn't seem to have effect on query plans #183

Closed
avandras opened this issue Apr 16, 2024 · 7 comments
Closed

Hint table doesn't seem to have effect on query plans #183

avandras opened this issue Apr 16, 2024 · 7 comments

Comments

@avandras
Copy link

I've got the following table:

test=# \d t_sample
                          Table "test.t_sample"
   Column    |           Type           | Collation | Nullable | Default 
-------------+--------------------------+-----------+----------+---------
 tstamp      | timestamp with time zone |           |          | 
 sensor_id   | integer                  |           |          | 
 measurement | numeric                  |           |          | 
Indexes:
    "t_sample_sensor_id_idx" btree (sensor_id)
    "t_sample_sensor_id_measurement_idx" btree (sensor_id, measurement)

pg_hint_plan is set up like this:

test=# SELECT version();
                                                              version                                                              
-----------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 15.6 (Ubuntu 15.6-1.pgdg22.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

test=# \dx pg_hint_plan 
           List of installed extensions
     Name     | Version |  Schema   | Description 
--------------+---------+-----------+-------------
 pg_hint_plan | 1.5.1   | hint_plan | 

test=# SELECT name, setting FROM pg_settings WHERE name LIKE '%pg_hint%';
              name              | setting 
--------------------------------+---------
 pg_hint_plan.debug_print       | on
 pg_hint_plan.enable_hint       | on
 pg_hint_plan.enable_hint_table | on
 pg_hint_plan.hints_anywhere    | off
 pg_hint_plan.message_level     | debug1
 pg_hint_plan.parse_messages    | info
(6 rows)

With these settings, I can do the following:

test=# EXPLAIN  /*+ SeqScan(s) */ SELECT * FROM t_sample AS s WHERE sensor_id = 4;
DEBUG:  pg_hint_plan:
used hint:
SeqScan(s)
not used hint:
duplication hint:
error hint:

                             QUERY PLAN                              
---------------------------------------------------------------------
 Seq Scan on t_sample s  (cost=0.00..815550.25 rows=473761 width=23)
   Filter: (sensor_id = 4)
(2 rows)

So far, so good.

Now I inserted the following into the hint table:

test=# TABLE hint_plan.hints;
 id |                norm_query_string                | application_name |   hints    
----+-------------------------------------------------+------------------+------------
  3 | SELECT * FROM t_sample AS s WHERE sensor_id = ? |                  | SeqScan(s)
(1 row)

As far as I can tell the norm query string matches my query, but there is still no hint being applied:

test=# EXPLAIN  SELECT * FROM t_sample AS s WHERE sensor_id = 4;
                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 Bitmap Heap Scan on t_sample s  (cost=5276.09..760738.78 rows=473761 width=23)
   Recheck Cond: (sensor_id = 4)
   ->  Bitmap Index Scan on t_sample_sensor_id_idx  (cost=0.00..5157.65 rows=473761 width=0)
         Index Cond: (sensor_id = 4)
(4 rows)

What am I doing wrong? And how to debug this further?

@avandras
Copy link
Author

FWIW, upon a hint that the EXPLAIN might result in a different query, I've enabled auto_explain and still see no involvement from the hint table:

SET auto_explain.log_min_duration = 0;

2024-04-16 15:30:14.910 CEST,"postgres","test",753717,"[local]",661d60e9.b8035,40,"SELECT",2024-04-15 19:16:25 CEST,4/1094,0,LOG,00000,"duration: 1558.567 ms  plan:
Query Text: SELECT * FROM t_sample AS s WHERE sensor_id = 4;
Bitmap Heap Scan on t_sample s  (cost=5276.09..760738.78 rows=473761 width=23)
  Recheck Cond: (sensor_id = 4)
  ->  Bitmap Index Scan on t_sample_sensor_id_idx  (cost=0.00..5157.65 rows=473761 width=0)
        Index Cond: (sensor_id = 4)",,,,,,,,,"psql","client backend",,-6175464486641872830

@avandras
Copy link
Author

Further analyzing this, I see the hint table is visited:

2024-04-16 16:28:50.942 CEST,"postgres","test",753717,"[local]",661d60e9.b8035,55,"SELECT",2024-04-15 19:16:25 CEST,4/1102,0,LOG,00000,"duration: 0.022 ms  plan:
Query Text: SELECT hints   FROM hint_plan.hints  WHERE norm_query_string = $1    AND ( application_name = $2     OR application_name = '' )  ORDER BY application_name DESC
Sort  (cost=11.31..11.31 rows=1 width=64) (actual time=0.021..0.022 rows=0 loops=1)
  Sort Key: application_name DESC
  Sort Method: quicksort  Memory: 25kB
  ->  Bitmap Heap Scan on hints  (cost=4.17..11.30 rows=1 width=64) (actual time=0.018..0.018 rows=0 loops=1)
        Recheck Cond: (norm_query_string = $1)
        Filter: ((application_name = $2) OR (application_name = ''::text))
        ->  Bitmap Index Scan on hints_norm_and_app  (cost=0.00..4.17 rows=3 width=0) (actual time=0.016..0.017 rows=0 loops=1)
              Index Cond: (norm_query_string = $1)",,,,,"SQL statement ""SELECT hints   FROM hint_plan.hints  WHERE norm_query_string = $1    AND ( application_name = $2     OR application_name = '' )  ORDER BY application_name DESC""",,,,"psql","client backend",,-6175464486641872830
2024-04-16 16:28:50.960 CEST,"postgres","test",753717,"[local]",661d60e9.b8035,56,"SELECT",2024-04-15 19:16:25 CEST,4/1102,0,LOG,00000,"duration: 17.125 ms  plan:
Query Text: SELECT * FROM t_sample AS s WHERE sensor_id = 4;
Bitmap Heap Scan on t_sample s  (cost=1653.55..50483.23 rows=15112 width=23) (actual time=5.504..13.173 rows=14401 loops=1)
  Recheck Cond: (sensor_id = 4)
  Heap Blocks: exact=14019
  ->  Bitmap Index Scan on t_sample_sensor_id_idx  (cost=0.00..1649.77 rows=15112 width=0) (actual time=3.899..3.900 rows=14401 loops=1)
        Index Cond: (sensor_id = 4)",,,,,,,,,"psql","client backend",,-6175464486641872830

but there is no row found matching my query.

My guess right now is that I don't fully understand how to come up with the norm query.

@avandras
Copy link
Author

avandras commented Apr 16, 2024

So, after further experimenting, we've found the following:

  • one has to include the trailing semicolon in the norm query string
  • whitespace is not handled as I expected: I had to remove one of the double spaces directly after EXPLAIN above
  • comments have to be retained as well

Based on all that, I think the query normalization approach has to be changed. It would be rather helpful if query strings from pg_stat_statements (or similar) could be used just with replacing positional parameters with question marks.

@df7cb
Copy link

df7cb commented Apr 16, 2024

There is a query id mechanism that got spun off pg_stat_statements and is now usable by other subsystems, I'd think pg_hint_plan should use that as well.

@fakdaddy75
Copy link

Had/have the same issue.
This gets almost impossible to use with complex wrapped queries.
I agree with above statement that query_id should be all you need to pass to a function to load the hint table. Or have someway to trace the postgres $1 value for what text its searching in the statement

SELECT hints FROM hint_plan.hints WHERE norm_query_string = $1 AND ( application_name = $2 OR application_name = '' ) ORDER BY application_name DESC

@michaelpq
Copy link
Collaborator

There is a query id mechanism that got spun off pg_stat_statements and is now usable by other subsystems, I'd think pg_hint_plan should use that as well.

Yeah, we really need to rip the query normalization entirely off pg_hint_plan and grab all the information we need from the query IDs. I've not been able to get down to do it, though. So if somebody would like to write a patch I could check, that would be welcome.

@michaelpq
Copy link
Collaborator

This stuff will be addressed by #190 as an item for the next release, and will not be backpatched to stable branches, so closing this issue for now to avoid a duplicate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants