Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CCR] Introduce leader index name & last fetch time stats to stats api response #33155

Merged
merged 9 commits into from
Aug 29, 2018

Conversation

martijnvg
Copy link
Member

No description provided.

@martijnvg martijnvg added review :Distributed Indexing/CCR Issues around the Cross Cluster State Replication features labels Aug 27, 2018
@martijnvg martijnvg requested a review from jasontedor August 27, 2018 04:02
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment about using the relative time provider, so we would report how long it has been since the most recent fetch. In general, I think this changes like this would be better as two separate PRs, these are separate changes.

private final Queue<Translog.Operation> buffer = new PriorityQueue<>(Comparator.comparing(Translog.Operation::seqNo));
private final LinkedHashMap<Long, ElasticsearchException> fetchExceptions;

ShardFollowNodeTask(long id, String type, String action, String description, TaskId parentTask, Map<String, String> headers,
ShardFollowTask params, BiConsumer<TimeValue, Runnable> scheduler, final LongSupplier relativeTimeProvider) {
ShardFollowTask params, BiConsumer<TimeValue, Runnable> scheduler, final LongSupplier relativeTimeProvider,
LongSupplier currentTimeSupplier) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use the relativeTimeSupplier and merely report the time (elapsed) since the last fetch.

@@ -89,6 +89,7 @@ protected void assertEqualInstances(final ShardFollowNodeTask.Status expectedIns
anyOf(instanceOf(ElasticsearchException.class), instanceOf(IllegalStateException.class)));
assertThat(entry.getValue().getCause().getMessage(), containsString(expected.getCause().getMessage()));
}
assertThat(newInstance.timeSinceLastFetch(), equalTo(expectedInstance.timeSinceLastFetch()));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️

@martijnvg
Copy link
Member Author

@jasontedor I've updated the PR to use report time since last fetch instead of the timestamp the last fetch occurred. Next time I will split a change like this into 2 PRs.

because that would be confusing for the consumer of the stats api.
Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking good, I left a comment.

timeSinceLastFetchMillis = TimeUnit.NANOSECONDS.toMillis(relativeTimeProvider.getAsLong() - lastFetchTime);
} else {
// To avoid confusion when ccr didn't yet execute a fetch:
timeSinceLastFetchMillis = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it should be 0, I think it should be -1 so that we can distinguish a fetch just happened less than 1ms ago from a fetch just happened.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means a serialization change too, to support -1.

@martijnvg
Copy link
Member Author

@jasontedor I've addressed that comment.

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one important comment, the rest is cosmetic. We will get this in on the next iteration.

@@ -412,7 +416,21 @@ public ShardId getFollowShardId() {

@Override
public synchronized Status getStatus() {
final String leaderIndex;
if (params.getLeaderClusterAlias() != null) {
leaderIndex = params.getLeaderClusterAlias() + ":" + params.getLeaderShardId().getIndexName();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should avoid re-computing this value every time we get status (think of a monitoring system polling the stats every second). We are creating unnecessary garbage on every poll for every shard.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, that's a calculation done under a lock, a lock that synchronizes a lot of other methods in this class.

@@ -648,7 +679,14 @@ public long numberOfOperationsIndexed() {
return fetchExceptions;
}

private final long timeSinceLastFetchMillis;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for keeping these in the same order as the constructor parameters, etc. 🙏

Status(
String leaderIndex,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be final.

@@ -648,7 +679,14 @@ public long numberOfOperationsIndexed() {
return fetchExceptions;
}

private final long timeSinceLastFetchMillis;

public long timeSinceLastFetch() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add Millis to the name.

@martijnvg
Copy link
Member Author

Thanks @jasontedor. I've updated the PR.

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

* ccr:
  Fix ShardFollowNodeTask.Status equals and hash code (elastic#33189)
  Make soft-deletes settings final (elastic#33172)
  Only fetch mapping updates when necessary (elastic#33182)
@martijnvg martijnvg merged commit 41c7fc8 into elastic:ccr Aug 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/CCR Issues around the Cross Cluster State Replication features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants