
Make partitions loading for iceberg table lazy to avoid unnecessary loading #23645

Open · wants to merge 2 commits into base: master
Conversation

hantangwangd
Member

@hantangwangd hantangwangd commented Sep 13, 2024

Description

Currently, when querying Iceberg tables, we always eagerly load all of the table's partition values in PickTableLayout or IcebergFilterPushdown during the optimization phase. Because these eagerly loaded partition values are only used by metadata-based optimization rules, in many cases the information is never used at all. This wastes resources and hurts performance in the following cases:

  • For queries that cannot be optimized based on metadata, these partition values are never used, so we should not load them eagerly.
  • For tables with a huge number of partitions, which are not suitable for metadata optimization, we should limit the maximum number of partitions that can be loaded during the loading phase, rather than loading all of them first and only then determining whether they exceed the threshold.

This PR makes partition loading lazy and supports setting a threshold for the maximum number of partitions that can be loaded during the loading phase. This avoids a lot of unnecessary loading in many scenarios, along with the resulting resource consumption and performance loss.
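The core idea can be sketched as follows. This is a minimal, hypothetical illustration of lazy, memoized partition loading with a load-time cap; the class and method names are invented for this sketch and are not the PR's actual classes, and the PR enforces the cap during the file scan itself rather than after loading completes.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch: names are illustrative, not the PR's actual classes.
// Partitions are only materialized on first access, and the result is memoized,
// so queries that never consult partition values pay no loading cost.
class LazilyLoadedPartitions
{
    private final Supplier<Optional<List<String>>> loader;
    private Optional<List<String>> partitions;
    private boolean loaded;

    LazilyLoadedPartitions(Supplier<List<String>> rawLoader, int maxPartitions)
    {
        this.loader = () -> {
            List<String> values = rawLoader.get();
            // Enforce the threshold at load time; an empty Optional stands in for
            // "too many partitions, treat the table as unpartitioned"
            return values.size() > maxPartitions ? Optional.empty() : Optional.of(values);
        };
    }

    synchronized Optional<List<String>> get()
    {
        if (!loaded) {
            partitions = loader.get();
            loaded = true;
        }
        return partitions;
    }
}
```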

The benchmark results also support this conclusion. We executed regular queries, queries applicable for metadata optimization, and queries applicable for further reducible optimization on tables with two different partition counts. A record count of 300 * 3 produces 900 partitions (below the default threshold of 1000), while 400 * 4 produces 1600 partitions (exceeding the default threshold of 1000). The benchmark code can be viewed here: https://github.com/hantangwangd/presto/blob/benchmark_for_lazy_load/presto-iceberg/src/test/java/com/facebook/presto/iceberg/BenchmarkIcebergQuery.java

The benchmark test result before this change is as follows:


----Before this change, always load partitions eagerly----

Benchmark                                  (recordCount)  Mode  Cnt     Score     Error  Units
BenchmarkIcebergQuery.testFurtherOptimize        300 * 3  avgt   10   154.888 ±   2.762  ms/op
BenchmarkIcebergQuery.testFurtherOptimize        400 * 4  avgt   10   201.214 ±   5.199  ms/op
BenchmarkIcebergQuery.testNormalQuery            300 * 3  avgt   10   673.455 ±  26.239  ms/op
BenchmarkIcebergQuery.testNormalQuery            400 * 4  avgt   10   907.111 ± 103.223  ms/op
BenchmarkIcebergQuery.testOptimize               300 * 3  avgt   10   252.138 ±   6.873  ms/op
BenchmarkIcebergQuery.testOptimize               400 * 4  avgt   10  1025.587 ±  78.809  ms/op

The benchmark test result after this change is as follows:


----After this change, lazy load partitions and check the max threshold in loading phase----

Benchmark                                  (recordCount)  Mode  Cnt    Score    Error  Units
BenchmarkIcebergQuery.testFurtherOptimize        300 * 3  avgt   10  164.981 ±  3.552  ms/op
BenchmarkIcebergQuery.testFurtherOptimize        400 * 4  avgt   10  211.434 ±  4.327  ms/op
BenchmarkIcebergQuery.testNormalQuery            300 * 3  avgt   10  437.028 ± 49.367  ms/op
BenchmarkIcebergQuery.testNormalQuery            400 * 4  avgt   10  634.008 ± 54.919  ms/op
BenchmarkIcebergQuery.testOptimize               300 * 3  avgt   10  257.394 ±  7.223  ms/op
BenchmarkIcebergQuery.testOptimize               400 * 4  avgt   10  808.018 ± 36.952  ms/op

As the results show, this change significantly improves the performance of queries that are not suitable for metadata optimization, while for statements that can be optimized based on metadata, performance also improves once the number of partitions exceeds the threshold. This is in line with expectations, and we can anticipate that the performance improvement will grow as the number of partitions increases.

Motivation and Context

Make partition loading for Iceberg tables lazy to avoid unnecessary loading.

Impact

N/A

Test Plan

  • Make sure the change does not affect existing tests
  • Newly added test cases in TestIcebergLogicalPlanner show the behavior under different max partition thresholds
  • Local benchmark tests confirm the performance improvement

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==


Iceberg Connector Changes
* Improve partitions loading by making them lazy for Iceberg tables to avoid unnecessary loading.

@hantangwangd hantangwangd force-pushed the make_partition_loading_lazy branch from 9028384 to e390db7 on September 14, 2024 12:11
@hantangwangd hantangwangd marked this pull request as ready for review September 15, 2024 03:31
@hantangwangd hantangwangd requested review from ZacBlanco and a team as code owners September 15, 2024 03:31
@hantangwangd hantangwangd requested review from presto-oss, imjalpreet, aaneja and tdcmeehan and removed request for presto-oss September 15, 2024 03:31
@hantangwangd hantangwangd changed the title [WIP]Make partitions loading for iceberg table lazy to avoid unnecessary loading Make partitions loading for iceberg table lazy to avoid unnecessary loading Sep 15, 2024
Contributor

@ZacBlanco ZacBlanco left a comment


I understand the motivation for this, but I am a little concerned about the other parts that implement the limiting of partition loading. It feels a little convoluted, especially the need to check the instance type of the iterable in order to set the maximum number of partitions to iterate. I feel like it could be simplified by creating an iterable implementation that accepts the limit in the first place, rather than updating it after the fact. I may be missing something in my understanding if that isn't a feasible solution.

The current implementation of partition loading in IcebergPartitionLoader looks good though.


import static java.util.Objects.requireNonNull;

public class LazyLoadedPartitions
Contributor


I feel like this class itself should just implement Iterable. Is there a reason we need both LazyLoadedPartitions and LazyPartitionsIterable?

Member Author


Good suggestion; after rechecking the code, I think it's feasible and reasonable. Fixed!

*/
package com.facebook.presto.spi;

public interface LazyIterable<T>
Contributor


I feel like the interface makes the implementation more complex. Is there a reason we need the max iteration count to be mutable? Having a mutable iteration count feels like a bit of an anti-pattern. What if we instead use something like limit from the Stream API to prevent loading too many items? e.g.

Stream.of(new Iterable<>(...){}).limit(<N>)...
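The snippet above is shorthand rather than compilable Java; an Iterable is usually bridged to a Stream with StreamSupport. A minimal sketch of the limiting idea the comment suggests (the class and helper name here are invented for illustration):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Illustrative helper: truncate any Iterable to at most n elements via the
// Stream API, with no mutable max-count field on the iterable itself
class LimitIterableSketch
{
    static <T> List<T> firstN(Iterable<T> iterable, long n)
    {
        return StreamSupport.stream(iterable.spliterator(), false)
                .limit(n)
                .collect(Collectors.toList());
    }
}
```

As the author explains later in the thread, this approach turned out not to fit here, because encountering a delete file must abort the whole scan rather than merely truncate the results.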

Contributor


Another option could be adding a new field inside the DiscretePredicates class with the maximum number of predicates to iterate over. Then we don't have to rely on the casting + lazy iterable.

Another alternative could be putting a limit on the iterator we pass to the DiscretePredicates constructor, rather than having to modify the DiscretePredicates class at all.

Member Author


Thanks for the review and feedback. My full thinking is described below; please let me know if anything seems unreasonable.

There are several points to consider when implementing a max threshold for lazy partition loading:

  1. If any delete files are found during the file scanning process, the scanning and loading of partition values cannot continue; instead, a single value representing an unpartitioned state should be returned. So we cannot just scan a subset of the files to obtain a given number of valid partition values, and it seems we cannot set a max threshold and terminate the scan early through methods such as Stream.of(new Iterable<>(...){}).limit(<N>)....

  2. The max threshold can be passed into IcebergPartitionLoader and used there during loading to terminate the entire scanning process. Even if no delete files are encountered, once the number of loaded partitions exceeds the max threshold we can terminate directly and return a value representing an unpartitioned state. So we need a way to pass the max threshold into IcebergPartitionLoader.

  3. The table layout, that is, the lazy partitions and the DiscretePredicates information, is built earlier than the actual loading, while the value of the max threshold can only be determined when loading is actually required. For example, in IcebergMetadataOptimizer, if further reducible optimization is available we do not limit the number of partitions loaded; otherwise, the maximum is set to 1000 by default. So we need to support this delayed setting of the maximum threshold on the partition loader.

  4. LazyLoadedPartitions and PartitionLoader are both defined in presto-hive-common, while DiscretePredicates is defined in presto-spi, which does not depend on presto-hive-common. As mentioned above, we need a way to set the max threshold on the PartitionLoader through DiscretePredicates before the actual loading. So we define an interface in presto-spi and let LazyLoadedPartitions in presto-hive-common implement it; this way, the max threshold can ultimately be passed into the PartitionLoader through LazyLoadedPartitions.
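Points 2–4 can be sketched roughly as below. The interface name follows the discussion (LazyIterable), but the signature and the toy list-backed implementation are illustrative guesses; in the PR, the threshold feeds into IcebergPartitionLoader, and exceeding it yields a single unpartitioned value rather than a truncated list.

```java
import java.util.Iterator;
import java.util.List;

// Sketch of the cross-module pattern: the interface lives in presto-spi so
// code that only sees DiscretePredicates can set a threshold without
// depending on presto-hive-common, where the implementation lives.
interface LazyIterable<T> extends Iterable<T>
{
    // Called later by the optimizer, once it knows whether a limit applies
    void setMaxIterableCount(int maxIterableCount);
}

// Toy implementation backed by a list; the real LazyLoadedPartitions defers
// to a partition loader instead
class LazyPartitionsSketch implements LazyIterable<String>
{
    private final List<String> source;
    private int max = Integer.MAX_VALUE;

    LazyPartitionsSketch(List<String> source)
    {
        this.source = source;
    }

    @Override
    public void setMaxIterableCount(int maxIterableCount)
    {
        this.max = maxIterableCount;
    }

    @Override
    public Iterator<String> iterator()
    {
        // The toy simply truncates; the PR instead falls back to a single
        // value representing an unpartitioned table when the cap is exceeded
        return source.subList(0, Math.min(max, source.size())).iterator();
    }
}
```

A consumer holding only the presto-spi type can then check `instanceof LazyIterable` before calling `setMaxIterableCount(1000)`, which is the instance-type check discussed in the review above.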

@hantangwangd hantangwangd force-pushed the make_partition_loading_lazy branch from e390db7 to 827feb5 on October 1, 2024 02:38
@hantangwangd hantangwangd force-pushed the make_partition_loading_lazy branch from 827feb5 to 819c643 on November 3, 2024 12:42
@steveburnett
Contributor

Thanks for the release note entry! Minor suggestion to help follow the Order of changes in the Release Note Guidelines.

== RELEASE NOTES ==

Iceberg Connector Changes
* Improve partitions loading by making them lazy for Iceberg tables to avoid unnecessary loading. :pr:`23645`

@tdcmeehan tdcmeehan self-assigned this Nov 4, 2024
@hantangwangd
Member Author

@steveburnett Thanks for your suggestion, fixed! Please take a look when convenient.

@steveburnett
Contributor

> @steveburnett Thanks for your suggestion, fixed! Please take a look when convenient.

@hantangwangd LGTM, thanks!

@hantangwangd hantangwangd force-pushed the make_partition_loading_lazy branch from 819c643 to ce6b43f on February 15, 2025 00:40
@steveburnett
Contributor

Hi, new release note guidelines. Please remove the manual PR link in the following format from the release note entries for this PR.


:pr:`12345`

I have updated the Release Notes Guidelines to remove the examples of manually adding the PR link.

@hantangwangd
Member Author

> Hi, new release note guidelines. Please remove the manual PR link in the following format from the release note entries for this PR.

Oh sorry forgot that. Thanks for your reminder @steveburnett, fixed!
