Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Iceberg]Support renaming table on hive file catalog #24312

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

hantangwangd
Copy link
Member

@hantangwangd hantangwangd commented Jan 1, 2025

Description

This PR support renaming table behavior for iceberg tables on hive file catalog.

Motivation and Context

Support renaming table for Iceberg connector on as many catalog types as possible

Impact

Iceberg connector configured with hive file catalogs can now support renaming table

Test Plan

  • Enable existing test cases for renaming table on HIVE file catalog

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support of ``renaming table`` for Iceberg connector when configured with ``HIVE`` file catalog. :pr:`24312`

@hantangwangd hantangwangd force-pushed the support_hive_catalog_rename_table branch 2 times, most recently from 7cd47c9 to 366408c Compare January 1, 2025 12:02
@hantangwangd hantangwangd marked this pull request as ready for review January 1, 2025 17:21
@hantangwangd hantangwangd requested review from ZacBlanco and a team as code owners January 1, 2025 17:21
@hantangwangd hantangwangd requested review from presto-oss, agrawalreetika and tdcmeehan and removed request for presto-oss January 1, 2025 17:21
try {
if (!metadataFileSystem.rename(getTableMetadataDirectory(databaseName, tableName), getTableMetadataDirectory(newDatabaseName, newTableName))) {
throw new PrestoException(HIVE_METASTORE_ERROR, "Could not rename table directory");
if (!isIcebergTable(table)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - How about we put the positive condition in if-else instead since the implementation is valid for tables and views both :

if (isIcebergTable) {}
else {}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
else {
// If the directory `.prestoPermissions` exists, copy it to the new table metadata directory
if (metadataFileSystem.exists(new Path(metadataDirectory, PRESTO_PERMISSIONS_DIRECTORY_NAME))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I think it would be better to move Iceberg logic handle in different method and call that from here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, I have refactored the code, extracted two methods to make it more concise. Please take a look when available.

ZacBlanco
ZacBlanco previously approved these changes Jan 3, 2025
Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes are fine, however the code complexity seems high due to all the nested conditionals. I would consider factoring out some methods or subclassing as mentioned in my comment.

try {
if (!metadataFileSystem.rename(getTableMetadataDirectory(databaseName, tableName), getTableMetadataDirectory(newDatabaseName, newTableName))) {
throw new PrestoException(HIVE_METASTORE_ERROR, "Could not rename table directory");
if (!isIcebergTable(table)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to do it for this PR, but all these conditionals make me think that it is worth considering to subclass the FileHiveMetastore and create a FileIcebergHiveMetastore or something similar to capture the differences in behavior. It seems a bit messy currently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, seems there are indeed some implementation details that differ between them and need to be special handled. It is worth considering to put these different details into Iceberg's own subclass.

@steveburnett
Copy link
Contributor

Thanks for the release note! Nit of formatting suggestion:

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support of ``renaming table`` for Iceberg connector when configured with ``HIVE`` file catalog. :pr:`24312`

@hantangwangd
Copy link
Member Author

Thanks @steveburnett, fixed. Please take a look when available.

Copy link
Member

@agrawalreetika agrawalreetika left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hantangwangd Can we add some tests in iceberg which tests these newly added code lines?

{
try {
if (!metadataFileSystem.rename(originalMetadataDirectory, newMetadataDirectory)) {
throw new PrestoException(HIVE_METASTORE_ERROR, "Could not rename table directory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We can add some more details in exception here around which table directory is being renamed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!


// Rename file `.prestoSchema` to change it to the new metadata path
// This will atomically execute the table renaming behavior
if (!metadataFileSystem.rename(new Path(originalMetadataDirectory, PRESTO_SCHEMA_FILE_NAME), new Path(newMetadataDirectory, PRESTO_SCHEMA_FILE_NAME))) {
throw new PrestoException(HIVE_METASTORE_ERROR, "Could not rename table directory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We can add some more details in exception here around which table directory is being renamed

if (metadataFileSystem.exists(new Path(originalMetadataDirectory, PRESTO_PERMISSIONS_DIRECTORY_NAME))) {
if (!FileUtil.copy(metadataFileSystem, new Path(originalMetadataDirectory, PRESTO_PERMISSIONS_DIRECTORY_NAME),
metadataFileSystem, new Path(newMetadataDirectory, PRESTO_PERMISSIONS_DIRECTORY_NAME), false, metadataFileSystem.getConf())) {
throw new PrestoException(HIVE_METASTORE_ERROR, "Could not rename table directory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We can add some more details in exception here around which table directory is being renamed

@hantangwangd hantangwangd force-pushed the support_hive_catalog_rename_table branch from 4a0b64d to e3fa5dc Compare January 5, 2025 17:19
@hantangwangd hantangwangd marked this pull request as draft January 5, 2025 17:21
@hantangwangd hantangwangd force-pushed the support_hive_catalog_rename_table branch from e3fa5dc to 7330926 Compare January 6, 2025 03:44
@hantangwangd hantangwangd marked this pull request as ready for review January 6, 2025 07:29
@hantangwangd
Copy link
Member Author

Can we add some tests in iceberg which tests these newly added code lines?

I constructed a test environment based on a customized ExtendedFileSystem which can simulate the scenarios where certain file operations fail. In this way, I added some test cases to test these newly added code branches. Please take a look when available. And open to any better test solutions. @agrawalreetika @ZacBlanco

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants