Support Hudi merged view files for partition path updates without compaction #24283
base: master
@@ -1647,6 +1648,19 @@ public boolean isHudiMetadataEnabled()
return this.hudiMetadataEnabled;
}

@Config("hive.hudi-tables-use-merged-view")
@ConfigDescription("For Hudi tables prefer to fetch the list of files from the merged file system view")
I think the description should be clearer about the kind of value it expects. The type is a string, so it should tell users explicitly what the value should look like. When I first read this, I thought it should be a boolean.
Judging by the tests, it looks like it needs a dot-separated schema and table name? If so, the description should make clear which tables should be listed here.
@ConfigDescription("For Hudi tables prefer to fetch the list of files from the merged file system view")
@ConfigDescription("For Hudi tables, A comma-separated list in the form of <schema>.<table> which should prefer to fetch the list of files from the merged file system view")
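To make the expected value format concrete, here is a minimal sketch of how a comma-separated list of `<schema>.<table>` entries could be interpreted. It assumes the format proposed in the suggestion above. The class `MergedViewTables`, its methods, and the case-insensitive matching are hypothetical illustrations and are not taken from the PR.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the PR: shows one way the property value could be read.
final class MergedViewTables
{
    private MergedViewTables() {}

    // Splits a value such as "schema1.table1, schema2.table2" into trimmed, lower-cased entries.
    static Set<String> parse(String configValue)
    {
        if (configValue == null || configValue.trim().isEmpty()) {
            return new LinkedHashSet<>();
        }
        return Arrays.stream(configValue.split(","))
                .map(String::trim)
                .filter(entry -> !entry.isEmpty())
                .map(entry -> entry.toLowerCase(Locale.ENGLISH))
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // Returns true when "<schemaName>.<tableName>" appears in the configured list.
    // Lower-casing both sides is an assumption made for this sketch.
    static boolean usesMergedView(String configValue, String schemaName, String tableName)
    {
        String key = (schemaName + "." + tableName).toLowerCase(Locale.ENGLISH);
        return parse(configValue).contains(key);
    }
}
```

With this reading, a value such as `sales.orders,hr.staff` (made-up names) would opt in exactly those two tables, and an empty value would opt in none.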
@@ -608,6 +609,11 @@ public HiveSessionProperties(HiveClientConfig hiveClientConfig, OrcFileWriterCon
"For Hudi tables prefer to fetch the list of file names, sizes and other metadata from the internal metadata table rather than storage",
hiveClientConfig.isHudiMetadataEnabled(),
false),
stringProperty(
HUDI_TABLES_USE_MERGED_VIEW,
"For Hudi tables, use merged view to read data",
Please update the description here as well.
Description
Support Hudi merged view files for partition path updates without compaction. This is needed for Merge-on-Read Hudi tables when a record's partition path has been updated and the table has not yet been compacted.
... schemaName.tableName) where this support is needed.
... HudiDirectoryLister.
Motivation and Context
Support Hudi merged view files for partition path updates without compaction. This is needed for Merge-on-Read Hudi tables when a record's partition path has been updated and the table has not yet been compacted.
Impact
Support Hudi merged view files for partition path updates without compaction. This is needed for Merge-on-Read Hudi tables when a record's partition path has been updated and the table has not yet been compacted. As a result, even the read-optimized view of a Merge-on-Read table that has had partition path updates but no compaction should no longer return duplicates.
Test Plan
Contributor checklist
Release Notes