Read in the Iceberg metadata #28
Comments
@Fokko Thanks for writing this up. I think we are quite close to defining the table metadata. As for the JSON schema, I think the idea is great, and there are some Rust tools for it, but I haven't used them before, so I have no idea how much help they can provide. cc @JanKaul @Xuanwo, any ideas?
If no one else is interested, I could prepare a PR for the table metadata. Regarding the
Every query in Iceberg starts with the metadata. This is the JSON file that's created at each commit on an Iceberg table.
There are two versions, V1 and V2 (number three is underway).
What I would suggest is reading both V1 and V2 and merging them into a common structure in memory. This includes merging some fields:

- `schemas` is optional in V1, and `schema` is removed in V2. For V1 only the current schema was kept, but for V2 all the historical schemas are preserved as well. When reading a V1 table, the schema from `schema` would be added to `schemas`, and it would set the `current-schema-id` to the newly added schema.
- `partition-specs`
- Adding a `main` ref to the `refs` dict, pointing to the current snapshot.

There are also example manifests available from the Java repository: https://github.com/apache/iceberg/tree/master/core/src/test/resources
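
To make the proposed merge more concrete, here is a minimal Rust sketch of folding the raw fields into one in-memory structure. Everything in it is an assumption for illustration: the `Schema`, `RawMetadata`, and `TableMetadata` types and the `normalize` function are hypothetical names, schemas are reduced to just an ID, refs are reduced to a name-to-snapshot-ID map, and JSON parsing is left out entirely. The handling of `partition-specs` is not shown because it is not spelled out above.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified schema: only the ID matters here.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub schema_id: i32,
}

/// Hypothetical view of the raw fields as read from a V1 or V2 metadata file.
#[derive(Debug, Default)]
pub struct RawMetadata {
    pub schema: Option<Schema>,             // `schema` (V1 only)
    pub schemas: Option<Vec<Schema>>,       // `schemas` (optional in V1)
    pub current_schema_id: Option<i32>,     // `current-schema-id`
    pub current_snapshot_id: Option<i64>,   // `current-snapshot-id`
    pub refs: Option<HashMap<String, i64>>, // `refs`, simplified to name -> snapshot ID
}

/// Hypothetical common structure shared by V1 and V2 tables.
#[derive(Debug)]
pub struct TableMetadata {
    pub schemas: Vec<Schema>,
    pub current_schema_id: i32,
    pub refs: HashMap<String, i64>,
}

/// Merge the version-specific fields into the common structure.
pub fn normalize(raw: RawMetadata) -> Result<TableMetadata, String> {
    let (schemas, current_schema_id) = match (raw.schemas, raw.schema) {
        // `schemas` is present: keep it and require `current-schema-id`.
        (Some(schemas), _) => {
            let id = raw.current_schema_id.ok_or("missing current-schema-id")?;
            (schemas, id)
        }
        // Only the V1 `schema` field is present: add it to `schemas`
        // and point `current-schema-id` at the newly added schema.
        (None, Some(schema)) => {
            let id = schema.schema_id;
            (vec![schema], id)
        }
        (None, None) => return Err("neither schema nor schemas is set".to_string()),
    };

    // Add a `main` ref pointing to the current snapshot, unless one already exists.
    let mut refs = raw.refs.unwrap_or_default();
    if let Some(snapshot_id) = raw.current_snapshot_id {
        refs.entry("main".to_string()).or_insert(snapshot_id);
    }

    Ok(TableMetadata { schemas, current_schema_id, refs })
}
```

With this shape, code further down the read path would only ever see `TableMetadata` and would not need to branch on the format version. A real implementation would need full schema and ref types, and would have to decide how to treat a V1 file that carries both `schema` and `schemas`; the sketch simply prefers `schemas` in that case.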
PS: on a tangent, but related, I'm also thinking of creating a JSON schema. Would that be helpful for Rust?