
TPC-DS SF-1K Q72 extremely slow in Iceberg #22148

Closed
ethanyzhang opened this issue Mar 11, 2024 · 10 comments

@ethanyzhang
Contributor

ethanyzhang commented Mar 11, 2024

On Java clusters

Java Run Comparison Dashboard, Iceberg vs. Hive
[screenshot]

It took 1.04 hours to finish on the Iceberg schema and only 2 mins on the Hive schema.
Query Comparison Dashboard, Iceberg vs. Hive

On Prestissimo clusters

Prestissimo Run Comparison Dashboard, Iceberg vs. Hive
[screenshot]
We killed the query on the Prestissimo Iceberg schema after 6 hours. On Prestissimo Hive it took only 1.31 mins.
Query Comparison Dashboard, Iceberg vs. Hive

@github-project-automation github-project-automation bot moved this to 🆕 Unprioritized in Bugs and support requests Mar 11, 2024
@ethanyzhang ethanyzhang changed the title TPC-DS Q72 extremely slow in Iceberg TPC-DS SF-1K Q72 extremely slow in Iceberg Mar 11, 2024
@hantangwangd
Member

hantangwangd commented Mar 11, 2024

In my experiments, Q72's running time was greatly shortened after executing ANALYZE <tableName> on its involved tables, thanks to PR #20993 raised by @ZacBlanco. The NDV value is crucial when selecting the best join tree.

@yingsu00
Contributor

yingsu00 commented Mar 11, 2024

The Iceberg plan is very different from the Hive one: the Hive plan is a bushy tree, while the Iceberg plan is a left-deep tree. Could this mean the stats are being used differently?
Hive plan:
[screenshot]
Iceberg plan:
[screenshot]

@yzhang1991 Are you sure the Iceberg table stats were collected?

@ethanyzhang
Contributor Author

ethanyzhang commented Mar 11, 2024

@yingsu00 Yes, but I found that the stats we got from analyzing the Hive tables and the Iceberg tables are different. The two screenshots below show the stats for the same table at the same scale factor, one in the Hive catalog and one in the Iceberg catalog:
[screenshot]
[screenshot]
In the Hive table stats, there are no NULL values in the distinct_values_count column, but there are some NULLs in the data_size column.
However, in the Iceberg table stats, the distinct_values_count column is entirely NULL, while the data_size column has no NULLs.
@ZacBlanco I thought your NDV work would add the distinct value count; maybe I was confused?

@aaneja
Contributor

aaneja commented Mar 11, 2024

I think the issue here is with the setup: the Iceberg connector employs an iceberg.catalog.type of HIVE.

For this setup, the statistics-loading code path merges the stats from the HMS and Iceberg according to the iceberg.hive_statistics_merge_strategy session property (or the iceberg.hive-statistics-merge-strategy feature config). The default for this is NONE.
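As an alternative to setting the session property per session, the catalog-level default can be set in the Iceberg catalog properties file. A minimal sketch, assuming a typical deployment layout; the file path and metastore URI are placeholders:

```properties
# etc/catalog/iceberg.properties (path and metastore URI are placeholders)
connector.name=iceberg
iceberg.catalog.type=HIVE
hive.metastore.uri=thrift://metastore.example.com:9083
# Merge NDV and nulls-fraction stats from the HMS into Iceberg's table stats
iceberg.hive-statistics-merge-strategy=USE_NULLS_FRACTION_AND_NDV
```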

By changing this property to USE_NULLS_FRACTION_AND_NDV and then running ANALYZE, I can see that we do get non-zero NDV counts in SHOW STATS:

presto:tpcds_sf1000_parquet_iceberg> show stats for reason;
  column_name  | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value 
---------------+-----------+-----------------------+----------------+-----------+-----------+------------
 r_reason_sk   |     148.0 | NULL                  |            0.0 | NULL      | 1         | 65         
 r_reason_id   |     147.0 | NULL                  |            0.0 | NULL      | NULL      | NULL       
 r_reason_desc |     466.0 | NULL                  |            0.0 | NULL      | NULL      | NULL       
 NULL          | NULL      | NULL                  | NULL           |      65.0 | NULL      | NULL       
(4 rows)
presto:tpcds_sf1000_parquet_iceberg> set session iceberg.hive_statistics_merge_strategy='USE_NULLS_FRACTION_AND_NDV';
SET SESSION
presto:tpcds_sf1000_parquet_iceberg> analyze reason;
ANALYZE: 65 rows

Query 20240311_112336_00044_ivzjq, FINISHED, 3 nodes
Splits: 19 total, 19 done (100.00%)
[Latency: client-side: 0:02, server-side: 0:01] [65 rows, 2.36KB] [63 rows/s, 2.32KB/s]

presto:tpcds_sf1000_parquet_iceberg> show stats for reason;
  column_name  | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value 
---------------+-----------+-----------------------+----------------+-----------+-----------+------------
 r_reason_sk   |     148.0 |                  65.0 |            0.0 | NULL      | 1         | 65         
 r_reason_id   |     147.0 |                  65.0 |            0.0 | NULL      | NULL      | NULL       
 r_reason_desc |     466.0 |                  64.0 |            0.0 | NULL      | NULL      | NULL       
 NULL          | NULL      | NULL                  | NULL           |      65.0 | NULL      | NULL       
(4 rows)

We will need to do this (set the flag, re-analyze) for all tables in the schema.
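A sketch of one way to script that step. The schema name and table list below are assumptions; extend TABLES to cover all TPC-DS tables in your environment, then pipe the output into the Presto CLI (session properties persist for the duration of one CLI session):

```shell
# Hypothetical helper: print the SQL to run for every table in the schema.
SCHEMA="tpcds_sf1000_parquet_iceberg"
TABLES="reason item warehouse inventory store_sales catalog_sales"

echo "SET SESSION iceberg.hive_statistics_merge_strategy='USE_NULLS_FRACTION_AND_NDV';"
for t in $TABLES; do
  echo "ANALYZE ${SCHEMA}.${t};"
done
```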

@ZacBlanco
Contributor

ZacBlanco commented Mar 11, 2024

Re: support for NDVs in Iceberg+Hive. By default, ANALYZE will collect all of the Hive-supported statistics and store them in the Hive metastore. This should include NDVs and null counts.

When reading statistics from an Iceberg table that uses the Hive catalog, statistics can come from the table itself or from the Hive metastore. By default, we only use statistics from the Iceberg table, because values like min/max and null counts are supposed to be more accurate there: the manifests should always accurately reflect the data. However, NDVs and null counts from the Iceberg table are not always as accurate, so we provide the option to override the stats coming from the Iceberg table with stats from the HMS. To override the behavior when retrieving stats, set the connector session property iceberg.hive_statistics_merge_strategy.

You can set it to the following values:

  1. NONE (default): Doesn't use any stats from the HMS
  2. USE_NDV: overwrites NDV stats from iceberg table with those from the HMS
  3. USE_NULLS_FRACTIONS: overwrites nulls fraction stats from iceberg table with those from the HMS
  4. USE_NULLS_FRACTION_AND_NDV: overwrites NDV and nulls fraction stats from iceberg table with those from the HMS.

Since the HMS should already have the stats from the earlier ANALYZE, you shouldn't need to re-analyze the tables.

@ZacBlanco
Contributor

ZacBlanco commented Mar 11, 2024

Adding on to this investigation. I confirmed that after analyzing the Iceberg table and running SET SESSION iceberg.hive_statistics_merge_strategy = 'USE_NDV', the query executes successfully in about 45s with a plan shape similar to the Hive one @yingsu00 posted above.

It seems when this issue was filed the statistics for the iceberg versions of the tables were also not yet in the HMS. So ANALYZE was probably not run on these tables.

Query plan image (from the coordinator UI), showing the plan shape is the same as Hive's:
[screenshot]

Query JSON (attached)

@tdcmeehan
Contributor

@ZacBlanco quick unrelated question: when would null fractions ever be inaccurate when they come from Iceberg? From my understanding of the spec, manifest files store the absolute count of nulls, so shouldn't the fraction be mergeable and hence completely accurate solely from the manifest file information?

@ethanyzhang
Contributor Author

Confirmed that this issue goes away after applying the session property in run c1w0_native_oss_sf1k_ice_ds_power_240311-214304. Closing this issue now.

@github-project-automation github-project-automation bot moved this from 🆕 Unprioritized to ✅ Done in Bugs and support requests Mar 11, 2024
@aaneja
Contributor

aaneja commented Mar 12, 2024

> It seems when this issue was filed the statistics for the iceberg versions of the tables were also not yet in the HMS. So ANALYZE was probably not run on these tables.

@ZacBlanco Can we test this hypothesis? I recall I analyzed a table and set the right feature flag, but could not get the NDVs to show up without doing a re-analyze. If we can add this as an integration test, that would be ideal.

@ZacBlanco
Contributor

@aaneja See #22162
