-
Notifications
You must be signed in to change notification settings - Fork 867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(flight): add helpers to handle CommandGetCatalogs
, CommandGetSchemas
, and CommandGetTables
requests
#4296
Conversation
Thanks @roeap -- I will review this carefully, but I probably won't have a chance until tomorrow or Tuesday |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @roeap -- I think this looks great!
I had some small API suggestions, but I don't think any are required (or I may propose them as a follow on PR, depending on your feedback).
Really appreciated 🙏
_request: Request<Ticket>, | ||
) -> Result<Response<<Self as FlightService>::DoGetStream>, Status> { | ||
Err(Status::unimplemented("do_get_tables not implemented")) | ||
let tables = TABLES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it is worth a comment that this does not implement full SQL style multipart identifier semantics to warn anyone who tries to use this code as a starting point for implementing FlightSQL? Maybe that will be obvious to anyone who uses this code, but we fought for quite a while to get consistent identifier semantics in DataFusion (one needs to parse it via SQL)
.collect::<HashSet<_>>(); | ||
|
||
let mut builder = GetSchemasBuilder::new(query.db_schema_filter_pattern); | ||
if let Some(catalog) = query.catalog { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the equality matching on catalog
should be the same for all implementations, what do you think about putting it in the builder itself?
The same comment applies to the GetTablesBuilder
too
We can do this as a follow on PR too (or never)
Something like:
let mut builder = GetSchemasBuilder::new(query.db_schema_filter_pattern);
if let Some(catalog) = query.catalog {
builder = builder.with_catalog_filter(catalog)
}
for (catalog_name, schema_name) in schemas {
builder
.append(catalog_name, schema_name)
.map_err(Status::from)?;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb, since we are then using all the fields on the command object, should we just convert one into the other?
Someting like given query
is CommandGetDbSchemas
.
let mut builder: GetSchemasBuilder = query.builder();
implementing from, with maybe the downside, that you would always actually have to cast the type..
let mut builder: GetSchemasBuilder = query.into();
impl GetTablesBuilder { | ||
/// Create a new instance of [`GetTablesBuilder`] | ||
/// | ||
/// The builder handles filtering by schema and table patterns, the caller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned elsewhere I think it might be worth encapsulating all filtering in the Builders to make the API easier to use
Co-authored-by: Andrew Lamb <[email protected]>
* Improve docs and tests for SqlInfoList * Add an example/ * Update arrow-flight/src/sql/sql_info.rs Co-authored-by: Liang-Chi Hsieh <[email protected]> --------- Co-authored-by: Liang-Chi Hsieh <[email protected]>
* Improve docs and tests for SqlInfoList * Add an example/ * Update arrow-flight/src/sql/sql_info.rs Co-authored-by: Liang-Chi Hsieh <[email protected]> --------- Co-authored-by: Liang-Chi Hsieh <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took the liberty of fixing cargo fmt
and merging up from the master
branch.
@roeap shall we merge this PR as is, or do you plan to make any changes?
I wanted to follow up on your suggestions- integrating catalog filters etc. I'll leave it up to you if that should be here or in a followup 🙂 |
If you would like to do so, let's do it here so we don't accidentally release an API that is undergoing changes (I expect @tustvold will likely make the next release candidate on Friday, in 2 days. Thus if we merged this PR we would have a self imposed deadline of those 2 days to get subsequent changes in) |
Should be feasible to get it done by then. Also have a first draft for the xdbc stuff available- stretch goal .. 🙂 |
Thank you so much @roeap ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb - Added the additional filtering. This also made it quite trivial to convert the request directly into the builder, which makes for quite a convenient API - I think. So worthwhile eeffort in my opinion :).
mod tables; | ||
|
||
/// Returns the RecordBatch for | ||
pub fn get_catalogs_batch(mut catalog_names: Vec<String>) -> Result<RecordBatch> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a quick builder implementation - consistency is always nice ...
/// Append a row | ||
pub fn append( | ||
&mut self, | ||
catalog_name: impl AsRef<str>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not completely sure how this would be expected. The filter distinguishes None
which means no filtering vs ""
which means empty catalog. So in order to specify an empty catalog, one would pass in "". Thought about making this optional, and then testing for nulls in the filter, but in the end I felt its trivial to default to empty string for the caller, so handling nulls in the catalog array is not necessary.
.collect::<std::result::Result<Vec<_>, _>>()? | ||
.into_iter() | ||
// We know the arrays are of same length as they are produced fromn the same root array | ||
.reduce(|filter, arr| or(&filter, &arr).unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or
should only fail on misaligned arrays, which we should never see here.
let stream = FlightDataEncoderBuilder::new() | ||
.with_schema(get_db_schemas_schema()) | ||
.build(futures::stream::once(async { batch })) | ||
.map_err(Status::from); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing that this code is also always the same, should we add a method build_stream
?
@alamb - can we make it to get that in by tomorrow? 😄. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @roeap -- this is looking great. I am going to merge this as is and try to help with a small follow on with some additional docs.
.map_err(Status::from)?; | ||
} | ||
}; | ||
let mut builder = query.into_builder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ that is very nice
)?) | ||
} | ||
|
||
if let Some(catalog_filter_name) = catalog_filter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't matter here, but this can be implemented more efficiently (in a follow on PR perhaps) by checking equality in append
} | ||
|
||
impl CommandGetTables { | ||
pub fn into_builder(self) -> GetTablesBuilder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the into_builder()
function 👍
Which issue does this PR close?
Closes #4295
Rationale for this change
see #4295
What changes are included in this PR?
The main builder implementations were lifted from IOx, but altered to also include sorting the created record batches.
cc @alamb @avantgardnerio
Are there any user-facing changes?
Users can use the added Builders and methods to create responses to
CommandGetCatalogs
,CommandGetSchemas
, andCommandGetTables
requests.