Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not mutate RecoveryResponse #37204

Merged
merged 3 commits into from
Jan 8, 2019
Merged

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Jan 7, 2019

Today we create a global instance of RecoveryResponse then mutate it when executing each recovery step. This is okay for the current sequential recovery flow. However, this is not suitable for an asynchronous recovery which we are targeting. With this commit, we return the result of each step separately and construct a RecoveryResponse at the end.

Relates #37174

@dnhatn dnhatn added >enhancement :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. v7.0.0 v6.7.0 labels Jan 7, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

out.writeVLong(phase1TotalSize);
out.writeVLong(phase1ExistingTotalSize);
out.writeVLong(phase1Time);
out.writeVLong(phase1ThrottlingWaitTime);
if (out.getVersion().before(Version.V_7_0_0)) {
out.writeVLong(0L); // phase1ThrottlingWaitTime - not used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not used here because it's wrongly implemented (we add the throttle time to the source shard instead of the target shard, see createRecoverySourceHandler in PeerRecoverySourceService. Instead I think it should be send back to the target shard and added there at the end of recovery.

Let's keep the phase1ThrottlingWaitTime here for now, and follow-up with a fix for the actual stats.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we send the source throttle time each FileChunk request to the target

indexState.addSourceThrottling(request.sourceThrottleTimeInNanos());

Should phase1ThrottlingWaitTime be the total throttle time on both source and target? Note that we currently use RecoveryReponse only for logging purpose. Anyway, I added phase1ThrottlingWaitTime back.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should phase1ThrottlingWaitTime be the total throttle time on both source and target?

yes, we can transfer the knowledge of the source throttle time as part of the response object back to the target, and then add it to the target throttle time there. An alternative would be to transfer info about source throttle time with each file chunk that we send.

@dnhatn
Copy link
Member Author

dnhatn commented Jan 8, 2019

@ywelsch Thanks for looking. I have addressed your comments.

@dnhatn dnhatn requested a review from ywelsch January 8, 2019 16:15
Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/mutable/mutate/ in PR title :-)

@dnhatn dnhatn changed the title Do not mutable RecoveryResponse Do not mutate RecoveryResponse Jan 8, 2019
@dnhatn
Copy link
Member Author

dnhatn commented Jan 8, 2019

@elasticmachine run gradle build tests 2

@dnhatn
Copy link
Member Author

dnhatn commented Jan 8, 2019

@ywelsch thanks for reviewing.

@dnhatn dnhatn merged commit 87ac310 into elastic:master Jan 8, 2019
@dnhatn dnhatn deleted the recovery-response branch January 8, 2019 21:12
dnhatn added a commit that referenced this pull request Jan 9, 2019
Today we create a global instance of RecoveryResponse then mutate it
when executing each recovery step. This is okay for the current
sequential recovery flow but  not suitable for an asynchronous recovery
which we are targeting. With this commit, we return the result of each
step separately, then construct a RecoveryResponse at the end.

Relates #37174
kovrus added a commit to crate/crate that referenced this pull request Sep 11, 2019
kovrus added a commit to crate/crate that referenced this pull request Sep 11, 2019
kovrus added a commit to crate/crate that referenced this pull request Sep 12, 2019
kovrus added a commit to crate/crate that referenced this pull request Sep 12, 2019
mergify bot pushed a commit to crate/crate that referenced this pull request Sep 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. >enhancement v6.7.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants