Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(flight): add helpers to handle CommandGetCatalogs, CommandGetSchemas, and CommandGetTables requests #4296

Merged
merged 20 commits into from
Jun 1, 2023

Conversation

roeap
Copy link
Contributor

@roeap roeap commented May 27, 2023

Which issue does this PR close?

Closes #4295

Rationale for this change

see #4295

What changes are included in this PR?

The main builder implementations were lifted from IOx, but altered to also include sorting the created record batches.

cc @alamb @avantgardnerio

Are there any user-facing changes?

Users can use the added Builders and methods to create responses to CommandGetCatalogs, CommandGetSchemas, and CommandGetTables requests.

@github-actions github-actions bot added arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate labels May 27, 2023
@alamb
Copy link
Contributor

alamb commented May 28, 2023

Thanks @roeap -- I will review this carefully, but I probably won't have a chance until tomorrow or Tuesday

@alamb
Copy link
Contributor

alamb commented May 28, 2023

FYI @appletreeisyellow

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @roeap -- I think this looks great!

I had some small API suggestions, but I don't think any are required (or I may propose them as a follow on PR, depending on your feedback).

Really appreciated 🙏

arrow-flight/Cargo.toml Show resolved Hide resolved
arrow-flight/examples/flight_sql_server.rs Outdated Show resolved Hide resolved
_request: Request<Ticket>,
) -> Result<Response<<Self as FlightService>::DoGetStream>, Status> {
Err(Status::unimplemented("do_get_tables not implemented"))
let tables = TABLES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it is worth a comment that this does not implement full SQL style multipart identifier semantics to warn anyone who tries to use this code as a starting point for implementing FlightSQL? Maybe that will be obvious to anyone who uses this code, but we fought for quite a while to get consistent identifier semantics in DataFusion (one needs to parse it via SQL)

.collect::<HashSet<_>>();

let mut builder = GetSchemasBuilder::new(query.db_schema_filter_pattern);
if let Some(catalog) = query.catalog {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the equality matching on catalog should be the same for all implementations, what do you think about putting it in the builder itself?

The same comment applies to the GetTablesBuilder too

We can do this as a follow on PR too (or never)

Something like:

        let mut builder = GetSchemasBuilder::new(query.db_schema_filter_pattern);
        if let Some(catalog) = query.catalog {
          builder = builder.with_catalog_filter(catalog)
        }
        for (catalog_name, schema_name) in schemas {
              builder
                  .append(catalog_name, schema_name)
                  .map_err(Status::from)?;
        }

Copy link
Contributor Author

@roeap roeap May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb, since we are then using all the fields on the command object, should we just convert one into the other?

Someting like given query is CommandGetDbSchemas.

let mut builder: GetSchemasBuilder = query.builder();

implementing from, with maybe the downside, that you would always actually have to cast the type..

let mut builder: GetSchemasBuilder = query.into();

arrow-flight/src/sql/catalogs/db_schemas.rs Outdated Show resolved Hide resolved
impl GetTablesBuilder {
/// Create a new instance of [`GetTablesBuilder`]
///
/// The builder handles filtering by schema and table patterns, the caller
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned elsewhere I think it might be worth encapsulating all filtering in the Builders to make the API easier to use

arrow-flight/src/sql/catalogs/tables.rs Outdated Show resolved Hide resolved
arrow-flight/src/sql/catalogs/tables.rs Show resolved Hide resolved
arrow-flight/src/sql/catalogs/tables.rs Outdated Show resolved Hide resolved
arrow-flight/src/sql/catalogs/db_schemas.rs Outdated Show resolved Hide resolved
roeap and others added 7 commits May 30, 2023 22:33
* Improve docs and tests for SqlInfoList

* Add an example/

* Update arrow-flight/src/sql/sql_info.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

---------

Co-authored-by: Liang-Chi Hsieh <[email protected]>
* Improve docs and tests for SqlInfoList

* Add an example/

* Update arrow-flight/src/sql/sql_info.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

---------

Co-authored-by: Liang-Chi Hsieh <[email protected]>
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took the liberty of fixing cargo fmt and merging up from the master branch.

@roeap shall we merge this PR as is, or do you plan to make any changes?

@roeap
Copy link
Contributor Author

roeap commented May 31, 2023

I wanted to follow up on your suggestions- integrating catalog filters etc.

I'll leave it up to you if that should be here or in a followup 🙂

@alamb
Copy link
Contributor

alamb commented May 31, 2023

I'll leave it up to you if that should be here or in a followup 🙂

If you would like to do so, let's do it here so we don't accidentally release an API that is undergoing changes (I expect @tustvold will likely make the next release candidate on Friday, in 2 days. Thus if we merged this PR we would have a self imposed deadline of those 2 days to get subsequent changes in)

@roeap
Copy link
Contributor Author

roeap commented May 31, 2023

Should be feasible to get it done by then.

Also have a first draft for the xdbc stuff available- stretch goal .. 🙂

@alamb
Copy link
Contributor

alamb commented May 31, 2023

Thank you so much @roeap !

Copy link
Contributor Author

@roeap roeap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb - Added the additional filtering. This also made it quite trivial to convert the request directly into the builder, which makes for quite a convenient API - I think. So worthwhile eeffort in my opinion :).

mod tables;

/// Returns the RecordBatch for
pub fn get_catalogs_batch(mut catalog_names: Vec<String>) -> Result<RecordBatch> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick builder implementation - consistency is always nice ...

/// Append a row
pub fn append(
&mut self,
catalog_name: impl AsRef<str>,
Copy link
Contributor Author

@roeap roeap May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not completely sure how this would be expected. The filter distinguishes None which means no filtering vs "" which means empty catalog. So in order to specify an empty catalog, one would pass in "". Thought about making this optional, and then testing for nulls in the filter, but in the end I felt its trivial to default to empty string for the caller, so handling nulls in the catalog array is not necessary.

.collect::<std::result::Result<Vec<_>, _>>()?
.into_iter()
// We know the arrays are of same length as they are produced fromn the same root array
.reduce(|filter, arr| or(&filter, &arr).unwrap());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or should only fail on misaligned arrays, which we should never see here.

Comment on lines +440 to +443
let stream = FlightDataEncoderBuilder::new()
.with_schema(get_db_schemas_schema())
.build(futures::stream::once(async { batch }))
.map_err(Status::from);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing that this code is also always the same, should we add a method build_stream?

@roeap
Copy link
Contributor Author

roeap commented Jun 1, 2023

@alamb - can we make it to get that in by tomorrow? 😄.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @roeap -- this is looking great. I am going to merge this as is and try to help with a small follow on with some additional docs.

.map_err(Status::from)?;
}
};
let mut builder = query.into_builder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ that is very nice

)?)
}

if let Some(catalog_filter_name) = catalog_filter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't matter here, but this can be implemented more efficiently (in a follow on PR perhaps) by checking equality in append

}

impl CommandGetTables {
pub fn into_builder(self) -> GetTablesBuilder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the into_builder() function 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FLightSQL: add helpers to create CommandGetCatalogs, CommandGetSchemas, and CommandGetTables requests
3 participants