Make partitions loading for iceberg table lazy to avoid unnecessary loading #23645
base: master
Conversation
I understand the motivation for this, but I am a little concerned about the other parts which implement the limiting of the partition loading. It feels a little convoluted, especially requiring checking the instance type of the iterable to set the max number of partitions to iterate. I feel like it could be simplified by just creating an iterable implementation which accepts the limit in the first place, rather than updating it after the fact. I may be missing something in my understanding if that isn't a feasible solution.
The current implementation of partition loading in `IcebergPartitionLoader` looks good, though.
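A rough sketch of what an iterable that accepts the limit up front might look like; the class name, the Supplier-based loader, and the use of Guava's `Iterators.limit` are illustrative assumptions, not code from this PR:

```java
import java.util.Iterator;
import java.util.function.Supplier;

import static com.google.common.collect.Iterators.limit;
import static java.util.Objects.requireNonNull;

// Hypothetical illustration: the limit is fixed at construction time, and the
// underlying partitions are only computed when iteration actually starts.
public class LimitedLazyIterable<T>
        implements Iterable<T>
{
    private final Supplier<Iterable<T>> loader;
    private final int maxIterableCount;

    public LimitedLazyIterable(Supplier<Iterable<T>> loader, int maxIterableCount)
    {
        this.loader = requireNonNull(loader, "loader is null");
        this.maxIterableCount = maxIterableCount;
    }

    @Override
    public Iterator<T> iterator()
    {
        // Loading is deferred until here, and the iterator never yields
        // more than maxIterableCount elements.
        return limit(loader.get().iterator(), maxIterableCount);
    }
}
```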
import static java.util.Objects.requireNonNull;

public class LazyLoadedPartitions
I feel like this class itself should just implement `Iterable`. Is there a reason we need both `LazyLoadedPartitions` and `LazyPartitionsIterable`?
Good suggestion. After rechecking the code, I think it's feasible and reasonable. Fixed!
package com.facebook.presto.spi;

public interface LazyIterable&lt;T&gt;
I feel like the interface makes the implementation more complex. Is there a reason we need to have the max iteration count mutable? I feel like having this mutable iteration count is a bit of an anti-pattern. What if instead we use something like `limit` from the stream API to prevent loading too many items? e.g. `Stream.of(new Iterable<>(...){}).limit(<N>)...`
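For concreteness, a minimal sketch of the stream-based idea, assuming a generic `Iterable<T>` source; the class and method names are illustrative:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Sketch only: cap how many elements are pulled from a lazy Iterable with limit(),
// instead of mutating a max-iteration count on the iterable itself.
public final class StreamLimitExample
{
    private StreamLimitExample() {}

    public static <T> List<T> loadAtMost(Iterable<T> lazyPartitions, int maxPartitions)
    {
        // The stream stops pulling from the source after maxPartitions elements.
        return StreamSupport.stream(lazyPartitions.spliterator(), false)
                .limit(maxPartitions)
                .collect(Collectors.toList());
    }
}
```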
Another option could be adding a new field inside the `DiscretePredicates` class with the maximum number of predicates to iterate over? Then we don't have to rely on the casting + lazy iterable.
Another alternative could be putting a limit on the iterable we pass to the `DiscretePredicates` constructor, rather than even having to modify the `DiscretePredicates` class.
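A rough sketch of that second alternative, assuming the `DiscretePredicates(List<ColumnHandle>, Iterable<TupleDomain<ColumnHandle>>)` constructor in presto-spi and Guava's `Iterables.limit`; the helper class, names, and import paths are assumptions for illustration, not code from this PR:

```java
import java.util.List;

import com.facebook.presto.common.predicate.TupleDomain;
import com.facebook.presto.spi.ColumnHandle;
import com.facebook.presto.spi.DiscretePredicates;
import com.google.common.collect.Iterables;

// Sketch only: instead of modifying DiscretePredicates, wrap the lazily loaded
// predicates in a size-limited lazy view before handing them to the constructor.
// Iterables.limit returns a view, so building it does not load any partitions eagerly.
public final class LimitedDiscretePredicatesExample
{
    private LimitedDiscretePredicatesExample() {}

    public static DiscretePredicates withLimit(
            List<ColumnHandle> partitionColumns,
            Iterable<TupleDomain<ColumnHandle>> lazyPredicates,
            int maxPartitionThreshold)
    {
        return new DiscretePredicates(partitionColumns, Iterables.limit(lazyPredicates, maxPartitionThreshold));
    }
}
```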
Thanks for the review and feedback. The overall thinking is described below; please let me know if anything seems unreasonable.
There are several points to consider when implementing a max threshold for lazy partition loading:
- If any delete files are found during the file scanning process, the scanning and loading of partition values can no longer continue; instead, a single value representing "unpartitioned" should be returned. So we cannot just scan a subset of the files to get a specified number of available partition values, which means we cannot set the max threshold and terminate the scan early through methods such as `Stream.of(new Iterable<>(...){}).limit(<N>)...`.
- The max threshold can be passed into `IcebergPartitionLoader` and used there during loading to terminate the entire scanning process: even if no delete files are encountered, once the number of loaded partitions exceeds the max threshold, we can terminate directly and return a value representing "unpartitioned". So we have to figure out a way to pass the max threshold into `IcebergPartitionLoader`.
- The construction of the table layout, that is, the lazy partitions and `DiscretePredicates` information, happens earlier than the actual loading, while the value of the max threshold can only be determined when the actual loading is required. For example, in `IcebergMetadataOptimizer`, if further reducible optimization is available, we do not limit the max number of partitions loaded; otherwise, the max number of loaded partitions is set to 1000 by default. So we need to support this delayed setting of the maximum threshold for the partition loader.
- `LazyLoadedPartitions` and `PartitionLoader` are both defined in `presto-hive-common`, while `DiscretePredicates` is defined in `presto-spi`, which does not depend on `presto-hive-common`. As mentioned above, we need a way to set the max threshold into `PartitionLoader` through `DiscretePredicates` before the actual loading. So we define an interface in `presto-spi` and let `LazyLoadedPartitions` in `presto-hive-common` implement it; in this way, the max threshold can finally be passed into `PartitionLoader` through `LazyLoadedPartitions` (see the sketch after this list).
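A minimal sketch of the resulting layering, assuming a `setMaxIterableCount` method name; the exact signature in the PR may differ:

```java
package com.facebook.presto.spi;

// Sketch only: presto-spi defines the interface, so DiscretePredicates can cap the
// number of partitions to iterate without depending on presto-hive-common, where
// LazyLoadedPartitions (the implementation) and PartitionLoader live.
public interface LazyIterable<T>
        extends Iterable<T>
{
    // Set lazily (e.g. from an optimizer rule) before iteration starts, so the
    // underlying PartitionLoader can stop scanning once the threshold is exceeded.
    void setMaxIterableCount(int maxIterableCount);
}
```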
Thanks for the release note entry! Minor suggestion to help follow the Order of changes in the Release Note Guidelines.
@steveburnett Thanks for your suggestion, fixed! Please take a look when convenient.
@hantangwangd LGTM, thanks!
Hi, there are new release note guidelines. Please remove the manually added PR link from the release note entries for this PR. I have updated the Release Notes Guidelines to remove the examples of manually adding the PR link.
Oh sorry, I forgot that. Thanks for your reminder @steveburnett, fixed!
Description
Currently, when querying Iceberg tables, we always eagerly load all partition values of the table in `PickTableLayout` or `IcebergFilterPushdown` during the optimization phase. Since these eagerly loaded partition values are currently only used in metadata-based optimization rules, in many cases this information is not used at all, which can result in a lot of wasted resources and degraded performance.

This PR makes the loading behavior of partitions lazy and supports setting a threshold for the maximum number of partitions that can be loaded during the loading phase. In this way, we can avoid a lot of unnecessary loading in many scenarios, as well as the resulting resource consumption and performance loss.
The benchmark results also support the above conclusion. We execute regular query statements, query statements that are applicable for metadata optimization, and query statements that are applicable for further reducible optimization on tables with two different partition counts. Among them, `300 * 3` makes the table contain 900 partitions (not reaching the default threshold of 1000), while `400 * 4` makes the table contain 1600 partitions (exceeding the default threshold of 1000). The code can be viewed here: https://github.com/hantangwangd/presto/blob/benchmark_for_lazy_load/presto-iceberg/src/test/java/com/facebook/presto/iceberg/BenchmarkIcebergQuery.java

The benchmark test result before this change is as follows:

The benchmark test result after this change is as follows:
Because of the issues mentioned above, this change significantly improves the performance of queries that are not suitable for metadata optimization; for statements that can be optimized based on metadata, performance also improves once the number of partitions exceeds the threshold. This is in line with expectations, and it can be anticipated that the performance improvement will grow as the number of partitions increases.
Motivation and Context
Make partitions loading for iceberg table lazy to avoid unnecessary loading
Impact
N/A
Test Plan
Test cases in `TestIcebergLogicalPlanner` to show the behaviors with different max partition thresholds

Contributor checklist
Release Notes