Index the partitionValues map with column's physical name #278

Merged

@nicklan (Collaborator) commented Jul 11, 2024

From the protocol:

> In name mode, readers must resolve columns in the data files by their physical names as given by the column metadata property delta.columnMapping.physicalName in the Delta schema. Partition values and column level statistics will also be resolved by their physical names. For columns that are not found in the files, nulls need to be returned. Column ids are not used in this mode for resolution purposes.

So we need to map each column to its physical name before indexing the partitionValues map; otherwise we won't find the column. Note that the same is true for id mode, so this won't break when we add support for that.
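As a rough illustration of the lookup described above, here is a minimal sketch. The `Field` struct and the `partition_value` helper are hypothetical, simplified stand-ins rather than the kernel's actual types; only the metadata key `delta.columnMapping.physicalName` comes from the protocol text. The fallback to the logical name when no mapping metadata is present is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Metadata key the protocol uses for a column's physical name.
const PHYSICAL_NAME_KEY: &str = "delta.columnMapping.physicalName";

/// Hypothetical, simplified schema field (not the kernel's actual type).
struct Field {
    /// Logical (user-facing) column name.
    name: String,
    /// Column metadata as string key/value pairs.
    metadata: HashMap<String, String>,
}

impl Field {
    /// Physical name from the column-mapping metadata, falling back to the
    /// logical name when no mapping is present (an assumption of this sketch).
    fn physical_name(&self) -> &str {
        self.metadata
            .get(PHYSICAL_NAME_KEY)
            .map(String::as_str)
            .unwrap_or(self.name.as_str())
    }
}

/// Index the partition values map by the field's physical name.
/// `None` models the null that must be returned for a missing column.
fn partition_value<'a>(
    field: &Field,
    partition_values: &'a HashMap<String, String>,
) -> Option<&'a str> {
    partition_values
        .get(field.physical_name())
        .map(String::as_str)
}
```

With a field whose logical name is `date` and whose physical name is, say, `col-1234`, a map keyed by `col-1234` is found through `partition_value`, while indexing the same map with `date` finds nothing.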

@scovich (Collaborator) left a comment

Change looks good in isolation. My only question/concern would be -- how many other places are we using logical names where we should be using mapped physical names instead? Do we have a way to find them and prevent future mix-ups?

@nicklan (Collaborator, Author) commented Jul 15, 2024

> how many other places are we using logical names where we should be using mapped physical names instead? Do we have a way to find them and prevent future mix-ups?

A good question. I grepped the code for field.name and didn't see anything else concerning. I'll think about whether there's anything more systematic we could do. Something like passing fields everywhere, so that the method that wants the name can call the correct one (name vs logical_name), could work, but it would complicate other things, so I'm not sure it's worth it.
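To make the "pass fields everywhere" idea concrete, here is a hedged sketch of what it might look like. Every name in it (`Field`, `PartitionValues`, `get`, `require`) is hypothetical and invented for illustration; none of it is taken from the kernel. The wrapper hides the raw map, so a caller cannot index it with a bare string and has to hand over the field, leaving the choice of name to the callee.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified schema field (not the kernel's actual type).
struct Field {
    logical_name: String,
    physical_name: String,
}

/// Hypothetical wrapper around the partition values map. The map itself is
/// private, so lookups can only go through methods that take a `Field`.
struct PartitionValues {
    by_physical_name: HashMap<String, String>,
}

impl PartitionValues {
    fn new(by_physical_name: HashMap<String, String>) -> Self {
        Self { by_physical_name }
    }

    /// The callee, not the caller, decides that the physical name is the key.
    fn get(&self, field: &Field) -> Option<&str> {
        self.by_physical_name
            .get(&field.physical_name)
            .map(String::as_str)
    }

    /// Same lookup, but reports a missing value using the logical name,
    /// which is the one a user would recognize.
    fn require(&self, field: &Field) -> Result<&str, String> {
        self.get(field).ok_or_else(|| {
            format!("no partition value for column '{}'", field.logical_name)
        })
    }
}
```

The trade-off matches the one noted above: call sites that only have a name in hand would need to carry the whole field around, which is where the extra complication comes from.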

@nicklan nicklan merged commit faa553d into delta-io:main Jul 16, 2024
9 checks passed