Support for list types? #309
Comments
This is definitely a bug, nested types like |
### Rationale for this change

Discovered while fixing apache/iceberg-go#309: we didn't correctly propagate the field-id metadata to children of List or Map fields, only structs.

### What changes are included in this PR?

A new MapType creator that constructs MapTypes from Arrow fields for the Key and Items, and a fix to the `pqarrow` schema manifest creation so that it correctly propagates the field-ID metadata to the children.

### Are these changes tested?

A unit test is added.

### Are there any user-facing changes?

Schemas produced by `pqarrow` when reading List/Map typed fields will now correctly contain the `PARQUET:field_id` metadata key.
@GabrielM98 please take a look at the linked PR and confirm for me that it solves your problem.
LGTM @zeroshade 👍 I'm getting some repeated warn level logs from the AWS SDK (see below), but other than that it works as expected. Thanks for the quick fix!
Yeah, I'm seeing the same warnings. I think it's related to apache/iceberg#12264, and I might have to disable the strong integrity checksum.
Apache Iceberg version
v0.1.0
Please describe the bug 🐞
Does the library support scanning tables with fields of type `list`?

I'm seeing some strange behaviour whilst attempting to scan a table (with all fields selected and no row filters applied) with the following schema:
When I call `(*table.Scan).ToArrowRecords` and attempt to loop over the resulting iterator, the loop yields nothing.

Hooking up a debugger to my code, I can see there's an error being returned by `(*table.Scan).recordsFromTask` (here), which is resulting in the context being cancelled. Hence, the iterator returns without yielding anything. However, on some occasions it does yield an error, which seems to indicate that there's a race condition between the write to the done channel of the `context.Context` and the write to the `out` channel in `(*table.Scan).recordsFromTask` (here).

Race condition aside, the error being returned is the following...
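On the race condition itself: the behaviour described above can be reproduced in isolation. The sketch below is not the library's code; `racyReport`, `causeReport`, and `errTask` are hypothetical names, and it assumes Go 1.20 or later for `context.WithCancelCause`. It shows why an error is only sometimes observed when a cancelled context and a channel send are both ready in the same `select`, and one possible way to make the error recoverable regardless of which case wins.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errTask stands in for the error returned by a failing scan task.
var errTask = errors.New("task failed")

// racyReport cancels the context and then tries to hand the error to the
// consumer. Both select cases are ready at that point, and Go picks one at
// random, so the error is sometimes dropped.
func racyReport() (delivered bool) {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan error, 1) // one free slot, so the send is ready
	cancel()
	select {
	case out <- errTask:
		return true
	case <-ctx.Done():
		return false
	}
}

// causeReport records the error as the cancellation cause, so a consumer
// that only observes ctx.Done() can still recover it.
func causeReport() error {
	ctx, cancel := context.WithCancelCause(context.Background())
	cancel(errTask)
	<-ctx.Done()
	return context.Cause(ctx)
}

func main() {
	delivered := 0
	for i := 0; i < 1000; i++ {
		if racyReport() {
			delivered++
		}
	}
	// The count varies from run to run because select chooses randomly.
	fmt.Printf("racy: error delivered in %d of 1000 runs\n", delivered)
	fmt.Println("cause:", causeReport())
}
```

Whether attaching the error as a cancellation cause suits the scan code is a design decision for the maintainers; the point here is only that the outcome of the racy `select` is not deterministic.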
I've been doing a bit of digging and noticed an intriguing bit of behaviour with regard to the projected field IDs. It appears that if the field is of type `map` or `list`, it doesn't get added to the set of projected field IDs (see the `switch` statement here). Is this a piece of functionality that is yet to be implemented, or is this intended behaviour? Thanks.
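To make the question concrete, here is a minimal sketch of a type switch that also visits `list` and `map` fields when collecting field IDs. It uses simplified, hypothetical types (`Struct`, `List`, `Map`, `Primitive`) rather than the library's own, and it simply collects every ID in the tree; which IDs a real projection should include is an assumption here, not something confirmed by the source.

```go
package main

import (
	"fmt"
	"sort"
)

// Type is a hypothetical, simplified stand-in for a table schema type.
type Type interface{}

type Primitive struct{ Name string }

type Field struct {
	ID   int
	Name string
	Type Type
}

type Struct struct{ Fields []Field }

type List struct {
	ElementID int
	Element   Type
}

type Map struct {
	KeyID, ValueID int
	Key, Value     Type
}

// collectIDs walks the type tree and records every field ID it meets,
// including the element ID of lists and the key/value IDs of maps.
func collectIDs(t Type, ids map[int]struct{}) {
	switch t := t.(type) {
	case Struct:
		for _, f := range t.Fields {
			ids[f.ID] = struct{}{}
			collectIDs(f.Type, ids)
		}
	case List:
		ids[t.ElementID] = struct{}{}
		collectIDs(t.Element, ids)
	case Map:
		ids[t.KeyID] = struct{}{}
		ids[t.ValueID] = struct{}{}
		collectIDs(t.Key, ids)
		collectIDs(t.Value, ids)
	}
}

// sortedIDs returns the collected IDs in ascending order.
func sortedIDs(t Type) []int {
	set := make(map[int]struct{})
	collectIDs(t, set)
	ids := make([]int, 0, len(set))
	for id := range set {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	return ids
}

// exampleSchema builds a made-up schema with one list and one map column.
func exampleSchema() Struct {
	str := Primitive{Name: "string"}
	return Struct{Fields: []Field{
		{ID: 1, Name: "id", Type: Primitive{Name: "long"}},
		{ID: 2, Name: "tags", Type: List{ElementID: 3, Element: str}},
		{ID: 4, Name: "attrs", Type: Map{KeyID: 5, Key: str, ValueID: 6, Value: str}},
	}}
}

func main() {
	fmt.Println(sortedIDs(exampleSchema())) // prints [1 2 3 4 5 6]
}
```

If the `List` and `Map` cases were removed from the switch, only IDs 1, 2, and 4 would be collected in this model, which is the kind of gap the question above is asking about.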