-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve latency for HBO optimizer #21297
Improve latency for HBO optimizer #21297
Conversation
947f845
to
ed5640c
Compare
if (enableVerboseRuntimeStats) { | ||
profileStartTime = System.nanoTime(); | ||
} | ||
planNodesWithHash.addAll(getPlanNodeHashes(plan, session, false).values()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cache only is set to false, as this is during registration of plan, hence we are supposed to have nothing in cache and populate the cache for future use.
Optional<String> hash = planCanonicalInfoProvider.hash(session, statsEquivalentPlanNode, strategy); | ||
allHashesBuilder.put(strategy, new PlanNodeWithHash(statsEquivalentPlanNode, hash)); | ||
Optional<String> hash = planCanonicalInfoProvider.hash(session, statsEquivalentPlanNode, strategy, cacheOnly); | ||
if (hash.isPresent()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only return hash when it's present, the returned hash can be empty if the "planCanonicalInfoProvider.hash" timeout.
@@ -147,7 +165,7 @@ private PlanNodeStatsEstimate getStatistics(PlanNode planNode, Session session, | |||
} | |||
|
|||
PlanNode plan = resolveGroupReferences(planNode, lookup); | |||
Map<PlanCanonicalizationStrategy, PlanNodeWithHash> allHashes = getPlanNodeHashes(plan, session); | |||
Map<PlanCanonicalizationStrategy, PlanNodeWithHash> allHashes = getPlanNodeHashes(plan, session, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is within "getStatistics" function which is called when reading estimates from HBO, and we set cacheOnly
to true here, as we are supposed to already have cache here, and we do not want to calculate on the fly if cache miss during cost based planning.
@@ -163,7 +181,7 @@ private PlanNodeStatsEstimate getStatistics(PlanNode planNode, Session session, | |||
for (PlanCanonicalizationStrategy strategy : historyBasedPlanCanonicalizationStrategyList()) { | |||
for (Map.Entry<PlanNodeWithHash, HistoricalPlanStatistics> entry : statistics.entrySet()) { | |||
if (allHashes.containsKey(strategy) && entry.getKey().getHash().isPresent() && allHashes.get(strategy).equals(entry.getKey())) { | |||
Optional<List<PlanStatistics>> inputTableStatistics = getPlanNodeInputTableStatistics(plan, session); | |||
Optional<List<PlanStatistics>> inputTableStatistics = getPlanNodeInputTableStatistics(plan, session, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly set to cacheOnly
to true as it's within getStatistics
function.
Map<CacheKey, PlanNodeCanonicalInfo> cache = historyBasedStatisticsCacheManager.getCanonicalInfoCache(session.getQueryId()); | ||
PlanNodeCanonicalInfo result = cache.get(key); | ||
if (result != null) { | ||
return Optional.of(result); | ||
if (result != null || cacheOnly) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If cacheOnly
set to true, return whatever got from cache. This field will be false
during registration call in HBO optimizer, so that all data is calculated and cached during registration, and later usage will only read form cache.
profileTime("CanonicalPlanGenerator", profileStartTime, session); | ||
} | ||
if (loadValueTimeout(startTimeInNano, timeoutInMilliseconds)) { | ||
return Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return early if timeout
Optional<String> hash = planCanonicalInfoProvider.hash(session, statsEquivalentPlanNode, strategy); | ||
Optional<List<PlanStatistics>> inputTableStatistics = planCanonicalInfoProvider.getInputTableStatistics(session, statsEquivalentPlanNode); | ||
Optional<String> hash = planCanonicalInfoProvider.hash(session, statsEquivalentPlanNode, strategy, true); | ||
Optional<List<PlanStatistics>> inputTableStatistics = planCanonicalInfoProvider.getInputTableStatistics(session, statsEquivalentPlanNode, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called at the end to collect the HBO information, hence cacheOnly
set to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice finds!
Please improve the release note message (see guidelines). Suggestion:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % a nit
presto-main/src/main/java/com/facebook/presto/cost/HistoryBasedPlanStatisticsCalculator.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/cost/HistoryBasedPlanStatisticsCalculator.java
Outdated
Show resolved
Hide resolved
ed5640c
to
797b59e
Compare
1. Check timeout for plan canonicalization, plan hash and return empty when timeout 2. Add a flag to specify whether to fetch HBO related data from cache only. It's false when register the query, and true when getting statistics 3. Add verbose run time stats for plan hash, plan canonicalization and meta data read
797b59e
to
6e30e08
Compare
Description
Part of #20355
This PR optimizes the latency for HBO optimizer from the following aspects:
Previously it only checks timeout for meta data access, and do not return immediately when timeout. However, we found that plan canonicalization and hash can also be time consuming. And we need to check for these two, and also return immediately when timeout.
In the design of HBO, all HBO related stats are fetched and cached during query registration in
HistoricalStatisticsEquivalentPlanMarkingOptimizer
optimizer, and following fetch of stats will access from cache. This PR adds a new parametercacheOnly
to corresponding methods to enforce this behavior.This is only enabled when verbose runtime flag is set to true, and will help to debug HBO timeout.
Motivation and Context
This is to reduce the latency of HBO optimizer.
Impact
Tested with a few production queries, see significant drop of latency for HBO optimizer.
Test Plan
Add unit tests. Also tested locally end to end to verify that HBO optimizer latency decreased. In tests, queries which takes minutes to have HBO optimizer timeout now timeouts within specified timeout limit, i.e. 20seconds.
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.