
TPC-DS SF-1K Q72 extremely slow in Iceberg #22148

Closed
ethanyzhang opened this issue Mar 11, 2024 · 10 comments

@ethanyzhang
Contributor

ethanyzhang commented Mar 11, 2024

On Java clusters

Java Run Comparison Dashboard, Iceberg vs. Hive
[screenshot]

It took 1.04 hours to finish on the Iceberg schema and only 2 mins on the Hive schema.
Query Comparison Dashboard, Iceberg vs. Hive

On Prestissimo clusters

Prestissimo Run Comparison Dashboard, Iceberg vs. Hive
[screenshot]
We killed the query on the Prestissimo Iceberg schema after 6 hours. On Prestissimo Hive it took only 1.31 mins.
Query Comparison Dashboard, Iceberg vs. Hive

@github-project-automation github-project-automation bot moved this to 🆕 Unprioritized in Bugs and support requests Mar 11, 2024
@ethanyzhang ethanyzhang changed the title TPC-DS Q72 extremely slow in Iceberg TPC-DS SF-1K Q72 extremely slow in Iceberg Mar 11, 2024
@hantangwangd
Member

hantangwangd commented Mar 11, 2024

In my experiments, Q72's running time was greatly shortened after executing ANALYZE <tableName> on its involved tables, thanks to PR #20993 raised by @ZacBlanco. The NDV value is crucial when selecting the best join tree.

@yingsu00
Contributor

yingsu00 commented Mar 11, 2024

The Iceberg plan is very different from the Hive one: the Hive plan is a bushy tree, while the Iceberg plan is a left-deep tree. Could this mean the stats are being used differently?
Hive plan:
[screenshot]
Iceberg plan:
[screenshot]

@yzhang1991 Are you sure the Iceberg table stats were collected?

@ethanyzhang
Contributor Author

ethanyzhang commented Mar 11, 2024

@yingsu00 Yes, but I found that the stats we got from analyzing the Hive tables and the Iceberg tables are different. The two screenshots below show the stats for the same table at the same scale factor, one in the Hive catalog and one in the Iceberg catalog:
[screenshot]
[screenshot]
In the Hive table stats, there are no NULL values in the distinct_values_count column, but there are some NULLs in the data_size column.
However, in the Iceberg table stats, the distinct_values_count column is entirely NULL, while the data_size column has no NULLs.
@ZacBlanco I thought your NDV work would add the distinct value count; maybe I was confused?

@aaneja
Contributor

aaneja commented Mar 11, 2024

I think the issue here is with the setup: the Iceberg connector employs an iceberg.catalog.type of HIVE.

For this setup, the statistics-loading code path merges the stats from the HMS and Iceberg according to the iceberg.hive_statistics_merge_strategy session property (or the iceberg.hive-statistics-merge-strategy feature config). The default for this is NONE.
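As an alternative to setting the session property per session, the catalog-level default can be set in the Iceberg catalog properties file. A minimal sketch, assuming a typical deployment layout; the file path and metastore URI are placeholders:

```properties
# etc/catalog/iceberg.properties (path and metastore URI are placeholders)
connector.name=iceberg
iceberg.catalog.type=HIVE
hive.metastore.uri=thrift://metastore.example.com:9083
# Merge NDV and nulls-fraction stats from the HMS into Iceberg's table stats
iceberg.hive-statistics-merge-strategy=USE_NULLS_FRACTION_AND_NDV
```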

By changing this property to USE_NULLS_FRACTION_AND_NDV and then running ANALYZE, I can see that we do get non-zero NDV counts in SHOW STATS:

presto:tpcds_sf1000_parquet_iceberg> show stats for reason;
  column_name  | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value 
---------------+-----------+-----------------------+----------------+-----------+-----------+------------
 r_reason_sk   |     148.0 | NULL                  |            0.0 | NULL      | 1         | 65         
 r_reason_id   |     147.0 | NULL                  |            0.0 | NULL      | NULL      | NULL       
 r_reason_desc |     466.0 | NULL                  |            0.0 | NULL      | NULL      | NULL       
 NULL          | NULL      | NULL                  | NULL           |      65.0 | NULL      | NULL       
(4 rows)
presto:tpcds_sf1000_parquet_iceberg> set session iceberg.hive_statistics_merge_strategy='USE_NULLS_FRACTION_AND_NDV';
SET SESSION
presto:tpcds_sf1000_parquet_iceberg> analyze reason;
ANALYZE: 65 rows

Query 20240311_112336_00044_ivzjq, FINISHED, 3 nodes
Splits: 19 total, 19 done (100.00%)
[Latency: client-side: 0:02, server-side: 0:01] [65 rows, 2.36KB] [63 rows/s, 2.32KB/s]

presto:tpcds_sf1000_parquet_iceberg> show stats for reason;
  column_name  | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value 
---------------+-----------+-----------------------+----------------+-----------+-----------+------------
 r_reason_sk   |     148.0 |                  65.0 |            0.0 | NULL      | 1         | 65         
 r_reason_id   |     147.0 |                  65.0 |            0.0 | NULL      | NULL      | NULL       
 r_reason_desc |     466.0 |                  64.0 |            0.0 | NULL      | NULL      | NULL       
 NULL          | NULL      | NULL                  | NULL           |      65.0 | NULL      | NULL       
(4 rows)

We will need to do this (set the flag, re-analyze) for all tables in the schema.
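A sketch of one way to script that step. The schema name and table list below are assumptions; extend TABLES to cover all TPC-DS tables in your environment, then pipe the output into the Presto CLI (session properties persist for the duration of one CLI session):

```shell
# Hypothetical helper: print the SQL to run for every table in the schema.
SCHEMA="tpcds_sf1000_parquet_iceberg"
TABLES="reason item warehouse inventory store_sales catalog_sales"

echo "SET SESSION iceberg.hive_statistics_merge_strategy='USE_NULLS_FRACTION_AND_NDV';"
for t in $TABLES; do
  echo "ANALYZE ${SCHEMA}.${t};"
done
```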

@ZacBlanco
Contributor

ZacBlanco commented Mar 11, 2024

Re: support for NDVs in Iceberg+Hive. By default, ANALYZE will collect all of the Hive-supported statistics and store them in the Hive metastore. This should include NDVs and null counts.

When reading statistics from an Iceberg table that uses the Hive catalog, statistics can come from the table itself or from the Hive metastore. By default, we only use statistics from the Iceberg table, because values like min/max and null counts are supposed to be more accurate there: the manifests should always accurately reflect the data. However, NDVs and null counts from the Iceberg table are not always as accurate, so we provide the option to override the stats coming from the Iceberg table with stats from the HMS. To override the behavior when retrieving stats, set the connector session property iceberg.hive_statistics_merge_strategy.

You can set it to the following values:

  1. NONE (default): Doesn't use any stats from the HMS
  2. USE_NDV: overwrites NDV stats from iceberg table with those from the HMS
  3. USE_NULLS_FRACTIONS: overwrites nulls fraction stats from iceberg table with those from the HMS
  4. USE_NULLS_FRACTION_AND_NDV: overwrites NDV and nulls fraction stats from iceberg table with those from the HMS.

Since the HMS should already have the stats from the earlier ANALYZE, you shouldn't need to re-analyze the tables.

@ZacBlanco
Contributor

ZacBlanco commented Mar 11, 2024

Adding on to this investigation. I confirmed that after analyzing the Iceberg table and running SET SESSION iceberg.hive_statistics_merge_strategy = 'USE_NDV', the query executes successfully in about 45s with a plan shape similar to the Hive one @yingsu00 posted above.

It seems when this issue was filed the statistics for the iceberg versions of the tables were also not yet in the HMS. So ANALYZE was probably not run on these tables.

Query plan image (from the coordinator UI), showing the plan shape is the same as Hive's:
[screenshot]

Query JSON (attached)

@tdcmeehan
Contributor

@ZacBlanco quick unrelated question: when would null fractions ever be inaccurate when they come from Iceberg? From my understanding of the spec, manifest files store the absolute count of nulls, so shouldn't the fraction be mergeable and hence completely accurate solely from the manifest file information?

@ethanyzhang
Contributor Author

Confirmed that this issue goes away after applying the session property in run c1w0_native_oss_sf1k_ice_ds_power_240311-214304. Closing this issue now.

@github-project-automation github-project-automation bot moved this from 🆕 Unprioritized to ✅ Done in Bugs and support requests Mar 11, 2024
@aaneja
Contributor

aaneja commented Mar 12, 2024

> It seems when this issue was filed the statistics for the iceberg versions of the tables were also not yet in the HMS. So ANALYZE was probably not run on these tables.

@ZacBlanco Can we test this hypothesis? I recall I analyzed a table and set the right feature flag, but could not get the NDVs to show up without doing a re-analyze. If we can add this as an integration test, that would be ideal.

@ZacBlanco
Contributor

@aaneja See #22162
