Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that RestCatalog passes user config to FileIO #476

Merged
merged 2 commits into from
Aug 20, 2024
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 16 additions & 2 deletions crates/catalog/rest/src/catalog.rs
Original file line number Diff line number Diff line change
Expand Up @@ -504,8 +504,15 @@ impl Catalog for RestCatalog {
.query::<LoadTableResponse, ErrorResponse, OK>(request)
.await?;

let config = resp
.config
.unwrap_or_default()
.into_iter()
.chain(self.user_config.props.clone().into_iter())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if it was that way around, you could not use the local user_config to override a property that came from the REST service? I think we would want the ability to do that. I think it makes more sense the way around that I have it.

Copy link
Contributor

@liurenjie1024 liurenjie1024 Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both two methods have pros and cons.

If we prioritize user config, we get max flexibility, but also would be easy to introduce bugs. For example, what if user misconfiged some fileio property which should be managed by rest catalog?

Also, the rest catalog spec 's statement make me feel that we should repsect rest catalog's response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My rationale behind this was so that users could override values from the REST config. For instance if you wanted to try out using an S3 proxy / cache by overriding the value of the S3 warehouse endpoint. I can change it if you insist, but that would reduce the usefulness of this change in my eyes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a reasonable case for me. cc @Xuanwo @Fokko what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference for this but respecting the user's override seems sensible to me. We can mention its behavior in the API comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some doc comments to the two relevant methods to clarify the behaviour. Hope this helps.

.collect();

let file_io = self
.load_file_io(resp.metadata_location.as_deref(), resp.config)
.load_file_io(resp.metadata_location.as_deref(), Some(config))
.await?;

let table = Table::builder()
Expand Down Expand Up @@ -542,8 +549,15 @@ impl Catalog for RestCatalog {
.query::<LoadTableResponse, ErrorResponse, OK>(request)
.await?;

let config = resp
.config
.unwrap_or_default()
.into_iter()
.chain(self.user_config.props.clone().into_iter())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

.collect();

let file_io = self
.load_file_io(resp.metadata_location.as_deref(), resp.config)
.load_file_io(resp.metadata_location.as_deref(), Some(config))
.await?;

let table_builder = Table::builder()
Expand Down
Loading