Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add views support in iceberg connector #19872

Merged
merged 3 commits into from
Oct 5, 2023

Conversation

pratyakshsharma
Copy link
Contributor

@pratyakshsharma pratyakshsharma commented Jun 13, 2023

Cherry-pick of trinodb/trino@4b4e4f1

Co-authored-by: losipiuk

== RELEASE NOTES ==

Iceberg Changes
* Support view creation via Iceberg connector

@pratyakshsharma pratyakshsharma requested a review from a team as a code owner June 13, 2023 19:33
Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pratyakshsharma Can you please add the cherry-pick information in the commit and PR messages as stated in https://github.com/prestodb/presto/wiki/Review-and-Commit-guidelines#example-commit-message ?

@ChunxuTang
Copy link
Member

Hi @pratyakshsharma, thanks for your nice contribution!
Just a quick question: For now, the Iceberg connector supports Hive metastore catalog, Hadoop catalog, and Nessie catalog. Will your PR support view creation in all these catalogs or only Hive metastore?

@pratyakshsharma
Copy link
Contributor Author

@ChunxuTang this PR only supports views via hive catalog.

@pratyakshsharma pratyakshsharma force-pushed the iceberg-view branch 2 times, most recently from b67abc4 to 243c9c1 Compare June 27, 2023 05:41
@pratyakshsharma
Copy link
Contributor Author

@yingsu00 This is good to review, all the tests are passing, I will retrigger the cancelled test after your review.

Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Could you please add the original PR(s) link and original authors in the PR message as well?
  2. Add another commit to back port trinodb/trino@52a2534 before backporting the views changes
  3. The TestIcebergMetadataListing tests change was not backported.
  4. TestIcebergHiveMetadataListing was not backported. Please find the original PR/commit that creates this test and backport it as separate commit(s) before this one.

@yingsu00 yingsu00 self-assigned this Jun 30, 2023
@pratyakshsharma pratyakshsharma force-pushed the iceberg-view branch 2 times, most recently from fe95211 to 5c8dc37 Compare July 19, 2023 13:10
MetastoreContext metastoreContext = getMetastoreContext(session);
ColumnConverter columnConverter = metastoreContext.getColumnConverter();

for (ColumnMetadata column : viewMetadata.getColumns()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this part of code come from? Strangely, I saw the original commit just added a dummy column. @imjalpreet @agrawalreetika Can you please confirm if this is correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pratyakshsharma Are we adding actual column names here, to keep metadata updated in HMS for views? Or for any other reason?
Do we know in the original commit since its adding dummy column in Table object, how does the metadata looks like there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I have followed the way views are created in hive connector in presto. Ultimately, both (view from trino and presto) have the same metadata in HMS as checked in iceberg.information_schema.columns table

@@ -206,6 +207,10 @@ public TableMetadata refresh()
throw new UnknownTableTypeException(getSchemaTableName());
}

if ("true".equalsIgnoreCase(table.getParameters().get(PRESTO_VIEW_FLAG))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these test failures from previous commits? If so, these fixes need to be in the same commit that introduces the test failure. A general good practice is that there should not be any commit breaks the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, have taken care of it.

row(storageTable, "_string"),
row(storageTable, "_integer"));
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pratyakshsharma Could you please confirm, why are we adding IcebergHiveMetadataListing related tests in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original trino commit which is backported for adding views support had changes in this class. Hence I had to backport this class as well so as to have proper test coverage. Also Ying had suggested doing that here #19872 (review).

.iterator());
onPresto().executeQuery("CREATE TABLE hive.default.hive_table (_double DOUBLE)");
onPresto().executeQuery("CREATE VIEW hive.default.hive_view AS SELECT * FROM hive.default.hive_table");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we creating hive tables and hive views here? Shouldn't it be iceberg tables & views for Iceberg testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetastoreContext metastoreContext = getMetastoreContext(session);
ColumnConverter columnConverter = metastoreContext.getColumnConverter();

for (ColumnMetadata column : viewMetadata.getColumns()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pratyakshsharma Are we adding actual column names here, to keep metadata updated in HMS for views? Or for any other reason?
Do we know in the original commit since its adding dummy column in Table object, how does the metadata looks like there?

@imjalpreet
Copy link
Member

Based on the test suite logs, it is failing when trying to run the profile singlenode-kerberos-hdfs-impersonation, so can we add another iceberg.properties in the respective directory similar to hive.properties.

.add(row("iceberg_table1", "_integer"))
.add(row("iceberg_view", "_string"))
.add(row("iceberg_view", "_integer"))
.add(row("hive_view", "_double"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@imjalpreet Thank you for the valuable suggestions, product tests are passing finally. I need to verify if we want hive_view to be listed as part of the above query? I feel this is not the desired behaviour.
Trino takes care of this already - https://github.com/trinodb/trino/pull/8153/files#diff-d45108e44a12da55f3fbda8c8cb9f3ea323ae3caa7f4dddd20fa2535b499a31e

@pratyakshsharma
Copy link
Contributor Author

@yingsu00 TestIcebergHiveMetadataListing was introduced as part of this commit in trino - https://github.com/trinodb/trino/pull/8153/commits/35316765099272863f009bbb2ced5d6b4d08aee4#diff-d45108e44a12da55f3fbd[…]3caa7f4dddd20fa2535b499a31e. I have added this as a separate commit but have not included the cherry pick information since I did not backport it completely.

@yingsu00
Copy link
Contributor

@wanglinsong Do you know how to start the 3 yellow tests?

@wanglinsong
Copy link
Member

@wanglinsong Do you know how to start the 3 yellow tests?

Looks like tests are stuck on circle ci. I don’t have that access.

@imjalpreet
Copy link
Member

I think we can do a final review and force push to retrigger the tests. These tests were not triggered in circle ci from what I can see.

@pratyakshsharma pratyakshsharma force-pushed the iceberg-view branch 4 times, most recently from 0a99fa5 to c0f8f3e Compare September 19, 2023 19:50
@pratyakshsharma
Copy link
Contributor Author

@yingsu00 @agrawalreetika @imjalpreet Test failures are unrelated. Please take a pass.

@pratyakshsharma pratyakshsharma force-pushed the iceberg-view branch 2 times, most recently from 807fdd3 to b95158d Compare September 20, 2023 17:33
@yingsu00
Copy link
Contributor

yingsu00 commented Sep 20, 2023

Details

Please make sure all tests pass before requesting reviews. Thank you. I reran the tests for you this time, in the future, you can try to re-push to trigger the new run.

@pratyakshsharma
Copy link
Contributor Author

@yingsu00 All checks have passed.

Copy link
Member

@agrawalreetika agrawalreetika left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from minor nits, overall looks good to me now.

@pratyakshsharma pratyakshsharma force-pushed the iceberg-view branch 2 times, most recently from 7d1c9ed to 8262d80 Compare September 28, 2023 10:57
@yingsu00
Copy link
Contributor

The product tests failed with "failed to register layer: write /usr/libexec/mysqld: no space left on device"
@wanglinsong How do we solve the problem?

Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending production tests

@skairali
Copy link
Member

skairali commented Oct 1, 2023

@yingsu00 this is a required fix for all enterprise presto consumers

@agrawalreetika
Copy link
Member

@pratyakshsharma Rebase your PR on master, this #21000 may help for product case space related failures.

Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple of small suggestions to consider for the docs. Let me know what you think of them, please.

presto-docs/src/main/sphinx/connector/iceberg.rst Outdated Show resolved Hide resolved
presto-docs/src/main/sphinx/connector/iceberg.rst Outdated Show resolved Hide resolved
@pratyakshsharma pratyakshsharma force-pushed the iceberg-view branch 2 times, most recently from 35459ce to 74569fe Compare October 3, 2023 11:50
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requested docs change present and verified in local docs build.

LGTM! (docs)

@yingsu00 yingsu00 merged commit 154ceeb into prestodb:master Oct 5, 2023
@wanglinsong wanglinsong mentioned this pull request Dec 8, 2023
26 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
iceberg Apache Iceberg related
Projects
None yet
Development

Successfully merging this pull request may close these issues.