-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Next release of pg_hint_plan with PostgreSQL 17 compatibility #190
Comments
Looking forward to trying out new changes! I'm sure @FranckPachot would be interested as well. |
Hi Michael-san!
3. Feature: plug-in query IDs rather than query string normalizations for
the hint table. This has been on the table for some time, and I really want
to get rid of that entirely now that query IDs are compiled in core,
included for utilities.
Does it mean the query string normalization column on the hint table will
be removed and
the query_id column will be added on the table instead?
The release would be planned for September, potentially in its 2nd week but
let's be flexible about that, just before PostgreSQL GA. In terms of
features, I think that I'll be able to tackle item 3, not sure about 4.
If @yamatattsu <https://github.com/yamatattsu> , @rjuju
<https://github.com/rjuju> or any others have any comments, feel free.
For now, The schedule seems fine for me.
I'd like to add this item to the TODO list:
6. Add new Japanese translation file (*.po) for the document
Message ID: ***@***.***>
… |
#3 is a really important feature IMO as the current hint_table is hard to use when dealing with large sql text or the same sql text that could have differences in comments and white space. Just a thought on this feature: I do think there needs to be a column that tracks the last time the hint was used. IMO, there are existing reasons a last_used is important to have, but especially with queryId changing if a table is dropped and recreated, pg_dump, etc., the user will need to perform maintenance on the table and remove no longer valid queryIds. Of course there is an overhead to continually updating a last_used, and I don't think that accuracy is required here ( maybe update last_used every so often ). Regards, Sami Imseih |
Cant wait for the query_id enhancement. thx |
Yes, it means to wipe out entirely this dependency from the tree. It would still be possible to cross-check the normalized string with pg_stat_statements, which is something that most users use anyway these days for monitoring. |
Hi Michael-san and Sami,
Yes, it means to wipe out entirely this dependency from the tree. It would
still be possible to cross-check the normalized string with
pg_stat_statements, which is something that most users use anyway these
days for monitoring.
Ah, I see.
In other words, pg_stat_statements is required as a prerequisite for using
the hint table of
pg_hint_plan. Right?
#3 <#3> is a really important
feature IMO as the current hint_table is hard to use when dealing with
large sql text or the same sql text that could have differences in comments
and white space.
As an extra information, pg_plan_advsr automatically generates normalized
query strings and
automatically registers them in the hint table for use.
I don't understand the issue with normalized query strings in detail, but
if you are having trouble with it,
you might want to try pg_plan_advsr.
…On Thu, May 2, 2024 at 6:57 AM Michael Paquier ***@***.***> wrote:
Does it mean the query string normalization column on the hint table will
be removed and the query_id column will be added on the table instead?
Yes, it means to wipe out entirely this dependency from the tree. It would
still be possible to cross-check the normalized string with
pg_stat_statements, which is something that most users use anyway these
days for monitoring.
—
Reply to this email directly, view it on GitHub
<#190 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADSHQG2VEVZNHSZFDQKYB7DZAFQMLAVCNFSM6AAAAABHBEXSC6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAOBZGE4TQMRUGM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Duplication. The normalization code in pg_hint_plan is a copy-paste of pg_stat_statements. I can see what you are getting at with pg_plan_advsr as it relies on pg_hint_plan's normalize_query.h. Still it looks like the same thing to me. Is there any point in normalizing the query if we have a query ID. Perhaps that's an overoptimistic assumption, but we can assume that pg_stat_statements is used, meaning that the query would be normalized and that it would be able to retrieve that with a join? @rjuju , thoughts? |
well, there's an assumption that some normalisation stuff is installed, as now pg_stat_statements can work with custom algorithm. so in theory if your use case is to add hints on queries touching temp tables because you don't issue vacuum and you know that a certain plan is better it could be a good idea to rely on a custom jumbling approach that computes a query identifier based on the relation names for temp tables so that you can actually have a stable queryid. |
pg_hint_plan uses the normalized query to do lookups of the hint table, using the same normalization as pg_stat_statements where the comments don't matter. There's a point that this reduces debugging if relying only on the normalized string if looking at the hint table, but honestly I'm going to hang on my argument about "let's just use pg_stat_statements" for the mapping and do JOINs, reducing the bloat of the hint table if using long query strings. We're paying twice the normalization costs, currently. |
my point was more with the curent plan of relying on the queryid only. if we do that there's no need for the query string or normalised query string right? so there would be no need for pgss either, just enable core queryid computation or install a custom one that better suits your needs. |
I can see what you are getting at with pg_plan_advsr as it relies on
pg_hint_plan's normalize_query.h. Still it looks like the same thing to me.
Is there any point in normalizing the query if we have a query ID. Perhaps
that's an overoptimistic assumption, but we can assume that
pg_stat_statements is used, meaning that the query would be normalized and
that it would be able to retrieve that with a join?
Ah, right, pg_plan_advsr uses pg_hint_plan's normalize_query.h. I thought
the problem was that it was difficult to create a normalized query string,
so I proposed to use pg_plan_advsr. But I understand that it is not the
problem on this thread.
I'm going to hang on my argument about "let's just use pg_stat_statements"
for the mapping and do JOINs, reducing the bloat of the hint table if using
long query strings. We're paying twice the normalization costs, currently.
The problem here is two things like:
1. reducing the bloat of the hint table by long query text
2. reducing the costs of normalization string
Is my understanding right?
If so, using queryid based on compute_query_id on PG core seems better to
solve the problems.
P.S.
pg_plan_advsr uses queryid by using compute_query_id since PG14, so I guess
it's okay to use it in the hint table.
|
Here is a repro of an example that I have seen on the field a few times. In the below case, my hints table includes the following query text
However, an application might issue a variant ( or variants ) of the query text in which a comment is added.
In this case, the hints table will only work for the uncommented version of the query text, even though they are semantically similar queries.
This is just on real-world case in which having a queryId will be very useful. There are other cases such as adding whitespace or a linebreak in the query text. All such queries will end up with the same queryId, but the hints table currently is not able to recognize this.
Regards, Sami |
From the point where the hint table requires the values to be inserted, providing a 64b value is just going to be less error-prone because there are less contents to manipulate.. |
Thanks. As far as I can see, there are no major objections about that. I will send a patch that does the switch in the hint table then, tweaking the docs that query ID values inserted in the hint table to match with the entries in pg_stat_statements. |
Thanks. As far as I can see, there are no major objections about that. I will send a patch that does the switch in the hint table then,
|
I have begun my butcher work for this release, and began with two things:
|
/*+ APPLICATION 2 */ SELECT * FROM t1 WHERE t1.id = ?;
postgres=# SELECT *
postgres-# FROM t1 WHERE t1.id = 1;
Thanks for the clarification! I understand the problems.
|
If you have any questions, we can talk about that on Friday in live. |
If you have any questions, we can talk about that on Friday in live.
Sounds good! Thanks.
|
What is the status in here? v17 beta2 is already out. |
Still need a bit of time, but I am towards a release by the end of August with the core changes pushed by the end of next week, before brushing what I can from the bug list. I don't have more plans for in-core changes with the module for this release, only bug fixes. |
Looking at my calendar, I am proposing the 29th of August for the tarball release. @devrimgunduz |
The branches have been structured and prepared for the upcoming release, and a new PG17 branch has been created with a github action job for automated tests. At this point it would be only a matter of tagging and creating the tarballs. Now is time for the real action: clean up of the existing bug list and address issues. |
So, ping :-) |
Yep. I'm aware of that. But my schedule does not give me the possibility to get back to that until the 17th of August. So. |
Hi Michael-san,
I finished updating .po files for the next release on PG17 branch.
7757374
Thanks,
Tatsuro Yamada
|
Thanks Yamada-san! |
Thanks.
When the release page is created, @shinyaaa will upload the RPMs for RHEL 8/9. |
Here is an update. I have most of the things I wanted to tackle for the moment in the tree, remains:
Hence in all that, I just have #191 that should get its backpatch. |
The tags have been pushed, the release notes have been written, the release has been pushed to PGXN and the release announcement has been sent to pgsql-announce, currently being hold for moderation. So my job here is done. @shinyaaa, could you upload the RPMs, please? |
@michaelpq Thank you! I've uploaded the RPMs. |
Cool, issue closed. |
Hi all,
I have that on my tablets for some time. Let's do a new release of this module at the same time as PostgreSQL 17. I have on my TODO list a few things, as of:
The release would be planned for September, potentially in its 2nd week but let's be flexible about that, just before PostgreSQL GA. In terms of features, I think that I'll be able to tackle item 3, not sure about 4.
If @yamatattsu , @rjuju or any others have any comments, feel free.
The text was updated successfully, but these errors were encountered: