-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support iceberg connector concurrent insertion #16983
Support iceberg connector concurrent insertion #16983
Conversation
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/file/FileHiveMetastore.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is the lock might not work properly when there are multiple presto clusters or other compute engines who are updating the certain table. I might propose to retry the failed insert ops rather than lock the table in presto. What do you think?
@beinan Thanks for the review! Actually there are two level lock to gurantee the |
Add a cherry pick info may be better @liupan664021 |
7512275
to
c7bee33
Compare
Got it, that makes a lot of sense. Thank you for the explanation! If it's a cherry-pick, could you link the original commit? thanks! |
@beinan @junyi1313 Sure, I'll add the cherry-pick info. |
c7bee33
to
add36c9
Compare
@beinan I tested concurrent insertions from two different presto cluster, and all the insertions finished without data missing. It seems the hive metastore lock and jvm lock work well. Without the lock, even concurrent insertions from same presto cluster may cause data missing. Any suggestion is appreciated :) |
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/file/FileHiveMetastore.java
Show resolved
Hide resolved
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/file/FileHiveMetastore.java
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/HiveTableOperations.java
Show resolved
Hide resolved
presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergSmoke.java
Outdated
Show resolved
Hide resolved
add36c9
to
9d307df
Compare
@ChunxuTang could you also help take a look? Thanks! |
@ChunxuTang Could you help take a look of this PR? Thanks! |
@liupan664021 @beinan Glad to help! Will review it soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liupan664021 Thanks for your nice work!
For the commit, as Apache Iceberg and Presto are two different projects, I think it's not a cherry-pick commit, though you reuse some functionalities from there. I would suggest just linking the two Iceberg commits in your description of the PR.
...-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/file/FileHiveMetastore.java
Show resolved
Hide resolved
@beinan BTW, just for a double check: I think this feature works for the HiveMetastore mode but not the native folder mode, right? |
d3d88d6
to
6034f5f
Compare
Thanks for the suggestion! I've updated the commit message and rebased the branch. |
@beinan @ChunxuTang Could you help have a look of this PR, thanks! |
@beinan Could you help to merge this PR? Any comments would be appreciated~ Thanks a lot :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for my delay. it looks good to me except a couple of minor style issues.
@ChunxuTang @zhenxiao do you to wanna a second look?
...o-hive-metastore/src/main/java/com/facebook/presto/hive/metastore/ExtendedHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Show resolved
Hide resolved
presto-iceberg/src/main/java/com/facebook/presto/iceberg/HiveTableOperations.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, @liupan664021
looks good to me
could you please confirm:
- this PR only references iceberg code changes. it is not a cherry-pick from trino or other projects
- with locking, multiple Presto clusters sharing the same metatore could support concurrent insert across Presto clusters, and if so, please add documentations for it, either in commit message, or the release notes
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
...e-metastore/src/main/java/com/facebook/presto/hive/metastore/thrift/ThriftHiveMetastore.java
Outdated
Show resolved
Hide resolved
Hello @liupan664021, could you address Zhenxiao's comments? since this pr is super useful and has been there for a while, I would like to merge it sooner rather than later. Thanks a lot! |
6034f5f
to
abb7d44
Compare
Thanks for your replies! @zhenxiao @beinan
|
@liupan664021, @beinan This change is breaking our builds due to new methods added to HiveMetastore interface. Could you please add default implementation if that makes sense or revert this PR. |
@abhiseksaikia I will try to add a default implementation asap. |
@beinan @liupan664021 This fix does not seem to work for more than 5 concurrent queries fired at the same time for iceberg connector. Can you please highlight how did you guys test this and if there is any configuration tuning needed to support more than 5 concurrent queries? |
Before this pr, if multiple clients insert into same table concurrently, the insertion can be exexuted successfully but some insertion will lost. Because the snapshot replacement required a metastore lock to make sure the isolation. Also the native
HiveTableOperations
in iceberg implements the lock guard.