Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QueryCtx enhancement (Config + Session property) (V1 deprecated) #6846

Closed
kewang1024 opened this issue Oct 2, 2023 · 13 comments
Closed

QueryCtx enhancement (Config + Session property) (V1 deprecated) #6846

kewang1024 opened this issue Oct 2, 2023 · 13 comments
Labels
enhancement New feature or request

Comments

@kewang1024
Copy link
Contributor

kewang1024 commented Oct 2, 2023

I had some discussion with @xiaoxmeng about the current velox config system and we're thinking of an enhancement:

Current problem

image

As of now, prestissimo wraps session properties into QueryCtx and pass it to velox for each query.
For connector configs read from config file
-they're passed in separately
-with duplicate configs.
For system config from config file, they're not passed in at all.

Such design makes the API complicated and confusing. We already saw misusage like passing a non-existing config.

Proposed solution

In the proposed solution, we will handle the translation logic in prestissimo layer and provide a clean API for velox to use.
image

Merge logic details

In the merge logic, it targets to:

image

Let me know what you guys think?
@mbasmanova, @majetideepak, @aditi-pandit, @amitkdutta, @spershin

@kewang1024 kewang1024 added the enhancement New feature or request label Oct 2, 2023
@majetideepak
Copy link
Collaborator

majetideepak commented Oct 2, 2023

@kewang1024 It definitely makes sense for Velox to see a single config entry for each use.
Do you know how Presto Java merges both connectorConfig(from session) and connectorProperties(from file)?
I also noticed that Velox tests do not have an easy way to set QueryCtx config. We need a way to set those here https://github.com/facebookincubator/velox/blob/main/velox/exec/tests/utils/Cursor.cpp#L124

@kewang1024
Copy link
Contributor Author

kewang1024 commented Oct 2, 2023

@kewang1024 It definitely makes sense for Velox to see a single config entry for each use. Do you know how Presto Java merges both connectorConfig(from session) and connectorProperties(from file)? I also noticed that Velox tests do not have an easy way to set QueryCtx config. We need a way to set those here https://github.com/facebookincubator/velox/blob/main/velox/exec/tests/utils/Cursor.cpp#L124

So java doesn't not do merge technically, but session property's default value comes from config and would override the config value, so in some sense, it still merges.

Thanks for the code pointer, I will keep this one in mind when I implement the above enhancement.

@tdcmeehan
Copy link

tdcmeehan commented Oct 2, 2023

The following questions are based on my understanding that the majority of work and design is related to Presto (if so, should a corresponding Presto issue be opened):

If Java doesn't do the merge, somehow we would need to configure the coordinator not to validate these special native_ session properties. Any idea how to go about that?

Additionally, would these session properties be returned in the output of SHOW SESSION?

@kewang1024
Copy link
Contributor Author

If Java doesn't do the merge, somehow we would need to configure the coordinator not to validate these special native_ session properties. Any idea how to go about that?

Additionally, would these session properties be returned in the output of SHOW SESSION?

Yes, it would be returned in the output.

As of now, we already have several native only session property, it's just not being used by Java.

@mbasmanova
Copy link
Contributor

@kewang1024 I find it difficult to follow this discussion. Would it be possible to illustrate the problem and a proposed solution with some examples?

@mbasmanova
Copy link
Contributor

My understanding is that some configuration properties cannot be modified at a query level and have to be fixed for the lifetime of the process. Do we have a proposal for how to handle these?

@aditi-pandit
Copy link
Collaborator

Have bunch of questions also

  • Many Velox properties like
async-data-cache-enabled

async-data-cache-ssd-gb
async-data-cache-ssd-checkpoint-gb
async-cache-ssd-path
async-cache-ssd-disable-file-cow

are used to configure global Velox structures and not changed at query time. Its not clear how your proposal handles these.

  • There are multiple properties that were common between Java worker and Native worker. e.g. join_spill_memory_threshold. In PR [Native] Trim native prefix for system session property prestodb/presto#21023 these are renamed with native prefix. At the SHOW SESSION level how are we separating these ? It would be confusing for a Prestissimo user to see a similar session property of the java worker. Its better to show only native system session properties to the Prestissimo clients.

  • What is the purpose of conversion map in the diagram ? If we are separating native system session properties like in PR 21023, then why is this needed ?

@vermapratyush
Copy link
Contributor

Regarding connector config.
How are the configs merged when 2 connectors of the same type is registered?
The properties from both the connectors will be conflicting with one another.

@kewang1024
Copy link
Contributor Author

@kewang1024 I find it difficult to follow this discussion. Would it be possible to illustrate the problem and a proposed solution with some examples?

Sure, one example is that we can have both config and session property passed in, but we actually only read one of it and ignore the other:
orc-compression-codec
orc_compression_codec

@kewang1024
Copy link
Contributor Author

Have bunch of questions also

  • Many Velox properties like
async-data-cache-enabled

async-data-cache-ssd-gb
async-data-cache-ssd-checkpoint-gb
async-cache-ssd-path
async-cache-ssd-disable-file-cow

are used to configure global Velox structures and not changed at query time. Its not clear how your proposal handles these.

  • There are multiple properties that were common between Java worker and Native worker. e.g. join_spill_memory_threshold. In PR [Native] Trim native prefix for system session property prestodb/presto#21023 these are renamed with native prefix. At the SHOW SESSION level how are we separating these ? It would be confusing for a Prestissimo user to see a similar session property of the java worker. Its better to show only native system session properties to the Prestissimo clients.
  • What is the purpose of conversion map in the diagram ? If we are separating native system session properties like in PR 21023, then why is this needed ?
  1. My current plan hasn't included changes for static config for now, we can do a brainstorming and refactoring for them as well
  2. I agree, if most people found it confusing to have both session property, then we can think about only showing native system session properties for Prestissimo. Thanks for the good suggestion.
  3. The purpose of conversion map is not for normal session property and native session property, it aims to provide mapping between config and session property
    eg:
    orc-compression-codec
    orc_compression_codec
    we need to be able to understand they're the same, and the value from session property should override the value from config.

@kewang1024
Copy link
Contributor Author

Regarding connector config. How are the configs merged when 2 connectors of the same type is registered? The properties from both the connectors will be conflicting with one another.

I believe they would have different catalog names of same catalog type

@majetideepak
Copy link
Collaborator

Only those variables that have both global and session scope will be merged.

@kewang1024
Copy link
Contributor Author

Version 2 of this enhancement: #7659

@kewang1024 kewang1024 changed the title QueryCtx enhancement (Config + Session property) QueryCtx enhancement (Config + Session property) (V1 deprecated) Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants