HDFS-17569 Change Code Logic for Generating Block Reconstruction Work #6924
base: trunk
💔 -1 overall
This message was automatically generated.
@jojochuang @zhe-thoughts Would you please take a look at this?
Description of PR
The RedundancyMonitor is a daemon that sleeps 3s between ticks.
In computeBlockReconstructionWork(int blocksToProcess), it first calls chooseLowRedundancyBlocks() to find the candidate low-redundancy blocks (at most blocksToProcess of them) and then calls computeReconstructionWorkForBlocks() to compute the reconstruction work.
But in some cases, the candidate low-redundancy blocks returned by chooseLowRedundancyBlocks() are skipped for reconstruction scheduling for various reasons (source unavailable, target not found, validation failed, etc.), while other lower-priority blocks that could be reconstructed have to wait many rounds before being scheduled.
How it happened in my case
In my case, I have a 7-datanode cluster (a1 ~ a5, b1 ~ b2), and I want to add a new datanode b3 to it and at the same time decommission a1 ~ a5.
I found that the decommission took one week to finish (with fewer than 10000 blocks on each node), and I found the logs below:
These blocks cannot be scheduled for reconstruction, but they use up the quota (blocksToProcess) in each tick and delay the scheduling of reconstruction for the replicas on the decommissioning nodes, so the decommission becomes very long-tailed.
So my solution is: when we meet low-redundancy blocks that have to be skipped for reconstruction, fast-forward within the current tick to check other low-redundancy blocks and schedule reconstruction for them.
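The fast-forward idea can be illustrated with a minimal sketch. This is a simplified model and not the actual BlockManager code: the class FastForwardSketch, the method scheduleOneTick, and the canSchedule predicate are hypothetical names introduced only for illustration, and blocks are represented as plain strings. It assumes candidates are visited in priority order and that only successfully scheduled blocks count against the quota.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Simplified model only; these names are hypothetical and not part of Hadoop.
class FastForwardSketch {

  /**
   * Walks the candidates in priority order and returns the blocks scheduled
   * in one tick. A block that cannot be scheduled is skipped without
   * consuming the quota, so the loop moves on to the next candidate.
   */
  static List<String> scheduleOneTick(List<String> candidatesByPriority,
                                      Predicate<String> canSchedule,
                                      int blocksToProcess) {
    List<String> scheduled = new ArrayList<>();
    for (String block : candidatesByPriority) {
      if (scheduled.size() >= blocksToProcess) {
        break; // the cap applies to successfully scheduled blocks
      }
      if (canSchedule.test(block)) {
        scheduled.add(block);
      }
      // Otherwise the block is skipped (e.g. source unavailable) and the
      // loop fast-forwards to the next candidate in the same tick.
    }
    return scheduled;
  }

  public static void main(String[] args) {
    List<String> queue =
        Arrays.asList("blk_1", "blk_2", "blk_3", "blk_4", "blk_5");
    // Assumption for the demo: blk_1 and blk_2 cannot be scheduled.
    Predicate<String> canSchedule =
        b -> !b.equals("blk_1") && !b.equals("blk_2");
    System.out.println(scheduleOneTick(queue, canSchedule, 2));
    // prints [blk_3, blk_4]
  }
}
```

With a quota of 2, the two unschedulable blocks at the head of the queue no longer consume the tick; blk_3 and blk_4 are scheduled instead, and blk_5 waits because the quota has been reached.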
Of course, the total number of blocks successfully scheduled for reconstruction is still capped by the blocksToProcess parameter.
How was this patch tested
In the unit test, we mock the case where some source nodes are so busy that they are skipped as sources for some high-priority blocks, so reconstruction scheduling for those blocks is skipped. computeBlockReconstructionWork() then goes on to check other blocks that can be scheduled for reconstruction within the same round, instead of sleeping and waiting for the next round.
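The effect on the number of rounds can be shown with a toy model of this scenario. It is not the unit test from the patch and does not use Hadoop classes: TickCountSketch and ticksToDrain are hypothetical names, and the model assumes a single cursor that advances through the queue and never revisits skipped blocks. It compares a mode where every examined block consumes the quota with a mode where only successfully scheduled blocks do.

```java
// Toy model of the mocked scenario; all names are hypothetical.
class TickCountSketch {

  /**
   * Returns the number of ticks until every schedulable block is scheduled.
   * schedulable[i] is false when block i must be skipped, for example
   * because its source node is busy. When fastForward is false, every
   * examined block consumes quota; when true, only scheduled blocks do.
   */
  static int ticksToDrain(boolean[] schedulable, int blocksToProcess,
                          boolean fastForward) {
    if (blocksToProcess < 1) {
      throw new IllegalArgumentException("blocksToProcess must be >= 1");
    }
    int remaining = 0;
    for (boolean s : schedulable) {
      if (s) {
        remaining++;
      }
    }
    int ticks = 0;
    int cursor = 0;
    while (remaining > 0) {
      ticks++;
      int used = 0;
      while (cursor < schedulable.length && used < blocksToProcess) {
        boolean ok = schedulable[cursor++];
        if (ok) {
          remaining--;
        }
        if (ok || !fastForward) {
          used++; // skipped blocks cost quota only without fast-forward
        }
      }
    }
    return ticks;
  }

  public static void main(String[] args) {
    // Six high-priority blocks with busy sources, then two schedulable ones.
    boolean[] queue = {false, false, false, false, false, false, true, true};
    System.out.println(ticksToDrain(queue, 2, false)); // prints 4
    System.out.println(ticksToDrain(queue, 2, true));  // prints 1
  }
}
```

In this model, with a quota of 2 the two schedulable blocks wait until the fourth tick when skipped blocks consume the quota, and are scheduled in the first tick when they do not.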