Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TaskCountStatsMonitor to monitor task count stats #6657

Merged
merged 4 commits into from
Dec 4, 2018

Conversation

QiuMM
Copy link
Member

@QiuMM QiuMM commented Nov 23, 2018

Have tested in real cluster and now running in production.

Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@QiuMM thanks for the PR. I left some comments.

@@ -100,6 +103,11 @@

private static final EmittingLogger log = new EmittingLogger(TaskQueue.class);

private final Map<String, AtomicLong> totalSuccessfulTaskCount = new ConcurrentHashMap<>();
private final Map<String, AtomicLong> totalFailedTaskCount = new ConcurrentHashMap<>();
private Map<String, Long> prevSuccessfulTaskCount = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about prevTotalSuccessfulTaskCount? Same for prevFailedTaskCount.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


if (status.isSuccess()) {
totalSuccessfulTaskCount.computeIfAbsent(task.getDataSource(), k -> new AtomicLong());
totalSuccessfulTaskCount.get(task.getDataSource()).incrementAndGet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computeIfAbsent() returns the proper instance of AtomicLong, so you can just call incrementAndGet() directly on that instance like below. This is better because we can avoid unnecessary get().

                  totalSuccessfulTaskCount.computeIfAbsent(task.getDataSource(), k -> new AtomicLong())
                                          .incrementAndGet();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, done.

totalSuccessfulTaskCount.get(task.getDataSource()).incrementAndGet();
} else {
totalFailedTaskCount.computeIfAbsent(task.getDataSource(), k -> new AtomicLong());
totalFailedTaskCount.get(task.getDataSource()).incrementAndGet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, done.

import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

/**
* Encapsulates the indexer leadership lifecycle.
*/
public class TaskMaster
public class TaskMaster implements TaskCountStatsProvider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does TaskQueue also need to be TaskCountStatsProvider?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, TaskCountStatsProvider is bind to TaskMaster, there is no need to let TaskQueue be TaskCountStatsProvider.

|`task/success/count`|Number of successful tasks per emission period. This metric is only available if the TaskCountStatsMonitor module is included.|dataSource.|Varies.|
|`task/failed/count`|Number of failed tasks per emission period. This metric is only available if the TaskCountStatsMonitor module is included.|dataSource.|Varies.|
|`task/running/count`|Number of current running tasks. This metric is only available if the TaskCountStatsMonitor module is included.|dataSource.|Varies.|
|`task/pending/count`|Number of current pending tasks. This metric is only available if the TaskCountStatsMonitor module is included.|dataSource.|Varies.|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add a metric for waiting tasks? It will be useful too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, done.

@@ -586,4 +602,36 @@ private void syncFromStorage()
return rv;
}

private Map<String, Long> getDeltaValues(Map<String, Long> total, Map<String, Long> prev)
{
return total.entrySet().stream().collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue() - prev.getOrDefault(e.getKey(), 0L)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the line width. The max line is 120 (https://github.com/apache/incubator-druid/blob/master/eclipse_formatting.xml#L96). Same for the below methods.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, done.


import java.util.Map;

public class TaskCountStatsMonitor extends AbstractMonitor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please add a unit test? I think it should verify the values emitted from this monitor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @QiuMM. Please consider my last comment.

@Override
public Map<String, Long> getSuccessfulTaskCount()
{
return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fill all metrics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@jihoonson jihoonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @QiuMM!

Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The new metrics look useful and they are pretty low footprint, once every monitor period.

@gianm gianm merged commit 6073390 into apache:master Dec 4, 2018
@jon-wei jon-wei added this to the 0.14.0 milestone Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants