Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support iceberg connector concurrent insertion #16983

Merged

Conversation

liupan664021
Copy link
Contributor

@liupan664021 liupan664021 commented Nov 12, 2021

Before this pr, if multiple clients insert into same table concurrently, the insertion can be exexuted successfully but some insertion will lost. Because the snapshot replacement required a metastore lock to make sure the isolation. Also the native HiveTableOperations in iceberg implements the lock guard.

== RELEASE NOTES ==

Iceberg Changes
* Support concurrent insertion from the same presto cluster or mutiple presto clusters which sharing the same metastore

Copy link
Member

@beinan beinan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is the lock might not work properly when there are multiple presto clusters or other compute engines who are updating the certain table. I might propose to retry the failed insert ops rather than lock the table in presto. What do you think?

@liupan664021
Copy link
Contributor Author

liupan664021 commented Nov 16, 2021

My concern is the lock might not work properly when there are multiple presto clusters or other compute engines who are updating the certain table. I might propose to retry the failed insert ops rather than lock the table in presto. What do you think?

@beinan Thanks for the review! Actually there are two level lock to gurantee the doCommit operation: a jvm level lock and a hive metastore lock acquired by a ThriftHiveMetastoreClient#lock. The lock part is similar to the iceberg native HiveTableOperations

@junyi1313
Copy link
Contributor

My concern is the lock might not work properly when there are multiple presto clusters or other compute engines who are updating the certain table. I might propose to retry the failed insert ops rather than lock the table in presto. What do you think?

@beinan Thanks for the review! Actually there are two level lock to gurantee the doCommit operation: a jvm level lock and a hive metastore lock acquired by a ThriftHiveMetastoreClient#lock. The lock part is similar to the iceberg native HiveTableOperations

Add a cherry pick info may be better @liupan664021

@liupan664021 liupan664021 force-pushed the support-iceberg-concurrent-insertion branch from 7512275 to c7bee33 Compare November 17, 2021 07:04
@liupan664021 liupan664021 reopened this Nov 17, 2021
@beinan
Copy link
Member

beinan commented Nov 18, 2021

My concern is the lock might not work properly when there are multiple presto clusters or other compute engines who are updating the certain table. I might propose to retry the failed insert ops rather than lock the table in presto. What do you think?

@beinan Thanks for the review! Actually there are two level lock to gurantee the doCommit operation: a jvm level lock and a hive metastore lock acquired by a ThriftHiveMetastoreClient#lock. The lock part is similar to the iceberg native HiveTableOperations

Add a cherry pick info may be better @liupan664021

My concern is the lock might not work properly when there are multiple presto clusters or other compute engines who are updating the certain table. I might propose to retry the failed insert ops rather than lock the table in presto. What do you think?

@beinan Thanks for the review! Actually there are two level lock to gurantee the doCommit operation: a jvm level lock and a hive metastore lock acquired by a ThriftHiveMetastoreClient#lock. The lock part is similar to the iceberg native HiveTableOperations

Got it, that makes a lot of sense. Thank you for the explanation!

If it's a cherry-pick, could you link the original commit? thanks!

@liupan664021
Copy link
Contributor Author

@beinan @junyi1313 Sure, I'll add the cherry-pick info.

@liupan664021 liupan664021 force-pushed the support-iceberg-concurrent-insertion branch from c7bee33 to add36c9 Compare November 18, 2021 13:15
@liupan664021
Copy link
Contributor Author

liupan664021 commented Nov 22, 2021

My concern is the lock might not work properly when there are multiple presto clusters or other compute engines who are updating the certain table. I might propose to retry the failed insert ops rather than lock the table in presto. What do you think?

@beinan I tested concurrent insertions from two different presto cluster, and all the insertions finished without data missing. It seems the hive metastore lock and jvm lock work well. Without the lock, even concurrent insertions from same presto cluster may cause data missing. Any suggestion is appreciated :)

@liupan664021 liupan664021 force-pushed the support-iceberg-concurrent-insertion branch from add36c9 to 9d307df Compare November 24, 2021 09:28
@liupan664021 liupan664021 reopened this Nov 26, 2021
@beinan
Copy link
Member

beinan commented Dec 1, 2021

@ChunxuTang could you also help take a look? Thanks!

@liupan664021
Copy link
Contributor Author

@ChunxuTang could you also help take a look? Thanks!

@ChunxuTang Could you help take a look of this PR? Thanks!

@ChunxuTang
Copy link
Member

@liupan664021 @beinan Glad to help! Will review it soon.

Copy link
Member

@ChunxuTang ChunxuTang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liupan664021 Thanks for your nice work!
For the commit, as Apache Iceberg and Presto are two different projects, I think it's not a cherry-pick commit, though you reuse some functionalities from there. I would suggest just linking the two Iceberg commits in your description of the PR.

@ChunxuTang
Copy link
Member

@beinan BTW, just for a double check: I think this feature works for the HiveMetastore mode but not the native folder mode, right?

@liupan664021 liupan664021 force-pushed the support-iceberg-concurrent-insertion branch 2 times, most recently from d3d88d6 to 6034f5f Compare December 15, 2021 08:36
@liupan664021 liupan664021 reopened this Dec 15, 2021
@liupan664021
Copy link
Contributor Author

@liupan664021 Thanks for your nice work! For the commit, as Apache Iceberg and Presto are two different projects, I think it's not a cherry-pick commit, though you reuse some functionalities from there. I would suggest just linking the two Iceberg commits in your description of the PR.

Thanks for the suggestion! I've updated the commit message and rebased the branch.

@liupan664021
Copy link
Contributor Author

@beinan @ChunxuTang Could you help have a look of this PR, thanks!

@liupan664021
Copy link
Contributor Author

@beinan Could you help to merge this PR? Any comments would be appreciated~ Thanks a lot :)

Copy link
Member

@beinan beinan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my delay. it looks good to me except a couple of minor style issues.

@ChunxuTang @zhenxiao do you to wanna a second look?

Copy link
Collaborator

@zhenxiao zhenxiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you, @liupan664021
looks good to me
could you please confirm:

  • this PR only references iceberg code changes. it is not a cherry-pick from trino or other projects
  • with locking, multiple Presto clusters sharing the same metatore could support concurrent insert across Presto clusters, and if so, please add documentations for it, either in commit message, or the release notes

@beinan
Copy link
Member

beinan commented Feb 9, 2022

thank you, @liupan664021 looks good to me could you please confirm:

  • this PR only references iceberg code changes. it is not a cherry-pick from trino or other projects
  • with locking, multiple Presto clusters sharing the same metatore could support concurrent insert across Presto clusters, and if so, please add documentations for it, either in commit message, or the release notes

Hello @liupan664021, could you address Zhenxiao's comments? since this pr is super useful and has been there for a while, I would like to merge it sooner rather than later. Thanks a lot!

@liupan664021 liupan664021 force-pushed the support-iceberg-concurrent-insertion branch from 6034f5f to abb7d44 Compare February 9, 2022 03:25
@liupan664021
Copy link
Contributor Author

thank you, @liupan664021 looks good to me could you please confirm:

  • this PR only references iceberg code changes. it is not a cherry-pick from trino or other projects
  • with locking, multiple Presto clusters sharing the same metatore could support concurrent insert across Presto clusters, and if so, please add documentations for it, either in commit message, or the release notes

Hello @liupan664021, could you address Zhenxiao's comments? since this pr is super useful and has been there for a while, I would like to merge it sooner rather than later. Thanks a lot!

Thanks for your replies! @zhenxiao @beinan

  1. This PR only references the iceberg code changes, not a cherry-pick from any other project including Trino.
  2. I tested concurrent insertions from two presto cluster which sharing the same metastore, and it works well. Also, I added the infomation to the Release notes.

@beinan beinan merged commit 59d092c into prestodb:master Feb 23, 2022
@abhiseksaikia
Copy link
Contributor

@liupan664021, @beinan This change is breaking our builds due to new methods added to HiveMetastore interface. Could you please add default implementation if that makes sense or revert this PR.

@beinan
Copy link
Member

beinan commented Feb 23, 2022

@abhiseksaikia I will try to add a default implementation asap.

@pratyakshsharma
Copy link
Contributor

@beinan @liupan664021 This fix does not seem to work for more than 5 concurrent queries fired at the same time for iceberg connector. Can you please highlight how did you guys test this and if there is any configuration tuning needed to support more than 5 concurrent queries?
cc @tdcmeehan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants