Optional segment load/drop management without zookeeper using http #4966

Merged (11 commits) on Oct 19, 2017


himanshug
Contributor

same as #4874

TODOs:

  • Test on a cluster with decent load.

This patch introduces the following changes.

On Historical Side:

  1. All non-zookeeper related code is extracted out of ZkCoordinator and moved into a new class, SegmentLoadDropHandler. ZkCoordinator now delegates load/drop requests to SegmentLoadDropHandler.
  2. A new method, ListenableFuture SegmentLoadDropHandler.processBatch(List<DataSegmentChangeRequest>), is added to support an http endpoint for segment load/drop management.
  3. A new endpoint, POST /druid-internal/v1/segments/changeRequests, is added in SegmentListerResource. It delegates each batch of load/drop requests it receives to ListenableFuture SegmentLoadDropHandler.processBatch(List<DataSegmentChangeRequest>). Here is a copy-paste of the javadoc for this endpoint.
  /**
   * This endpoint is used by HttpLoadQueuePeon to assign segment load/drop requests batch. This endpoint makes the
   * client wait till one of the following events occur. Note that this is implemented using async IO so no jetty
   * threads are held while in wait.
   *
   * (1) Given timeout elapses.
   * (2) Some load/drop request completed.
   *
   * It returns a map of "load/drop request -> SUCCESS/FAILED/PENDING status" for each request in the batch.
   */
  4. The druid.segmentCache.numLoadingThreads configuration is revived and must be greater than druid.coordinator.loadqueuepeon.http.batchSize, described later.
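
The wait semantics described in the javadoc above can be sketched roughly as follows. This is only an illustration, not the code in this patch: it uses the JDK's CompletableFuture (Java 9+) in place of Guava's ListenableFuture, and the names BatchLongPollSketch, Status, snapshot and the string request ids are hypothetical. The response completes when the timeout elapses or when any request in the batch finishes, and it then reports a status for every request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; class and method names are NOT Druid's actual API.
final class BatchLongPollSketch {
    enum Status { SUCCESS, FAILED, PENDING }

    // Returns a future that completes when (1) the timeout elapses or
    // (2) at least one request in the batch has completed, whichever is first.
    static CompletableFuture<Map<String, Status>> processBatch(
            Map<String, CompletableFuture<Void>> requests, long timeout, TimeUnit unit) {
        CompletableFuture<Object> anyDone =
                CompletableFuture.anyOf(requests.values().toArray(new CompletableFuture[0]));
        return anyDone
                .handle((result, error) -> null)          // a failed request also ends the wait
                .completeOnTimeout(null, timeout, unit)   // requires Java 9 or later
                .thenApply(ignored -> snapshot(requests));
    }

    // Builds the "request -> SUCCESS/FAILED/PENDING" map for the whole batch.
    static Map<String, Status> snapshot(Map<String, CompletableFuture<Void>> requests) {
        Map<String, Status> statuses = new LinkedHashMap<>();
        requests.forEach((id, future) -> {
            Status status = !future.isDone() ? Status.PENDING
                    : future.isCompletedExceptionally() ? Status.FAILED
                    : Status.SUCCESS;
            statuses.put(id, status);
        });
        return statuses;
    }
}
```

No thread blocks while waiting in this sketch; the caller attaches a callback to the returned future, which mirrors the async IO note in the javadoc.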

On Coordinator Side:

  1. CuratorLoadQueuePeon is created by copying the existing LoadQueuePeon class.
  2. LoadQueuePeon is made an abstract class.
  3. HttpLoadQueuePeon is created to do segment load/drop management via http. It uses the new endpoint introduced above to send batches of load/drop requests to the Historical node.
  4. The following new configurations are introduced.
    druid.coordinator.loadqueuepeon.type = http or curator; curator by default.
    druid.coordinator.loadqueuepeon.http.batchSize = number of load/drop requests to try to process in parallel; must be less than druid.segmentCache.numLoadingThreads on the historical.
    druid.coordinator.loadqueuepeon.http.repeatDelay = delay Duration for the periodic check for new load/drop requests to send to the historical; default 1 minute. Note that this could be very large, because the same code is also executed whenever a new load/drop is requested of HttpLoadQueuePeon, without waiting for the schedule to kick in.
    druid.coordinator.loadqueuepeon.http.hostTimeout = timeout to specify on the new endpoint introduced for batch load/drop requests; default 5 minutes.
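
To make the relationship between the two settings concrete, here is a small sketch of an operator-side sanity check. It is not part of this patch: the class LoadQueuePeonConfigCheck and its methods are hypothetical, the numeric values are illustrative rather than defaults, and the ISO-8601 strings (PT1M, PT5M) are an assumed way of writing the 1 minute and 5 minute durations.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration; Druid does not ship this class.
final class LoadQueuePeonConfigCheck {
    // Example coordinator settings (values are illustrative, not defaults).
    static final String COORDINATOR_EXAMPLE = String.join("\n",
            "druid.coordinator.loadqueuepeon.type=http",
            "druid.coordinator.loadqueuepeon.http.batchSize=4",
            "druid.coordinator.loadqueuepeon.http.repeatDelay=PT1M",
            "druid.coordinator.loadqueuepeon.http.hostTimeout=PT5M");

    // Example historical setting; it must exceed the coordinator's batchSize.
    static final String HISTORICAL_EXAMPLE = "druid.segmentCache.numLoadingThreads=5";

    // Parses java.util.Properties syntax from a string.
    static Properties parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    // True when batchSize is strictly less than numLoadingThreads.
    static boolean batchSizeFits(Properties coordinator, Properties historical) {
        int batchSize = Integer.parseInt(
                coordinator.getProperty("druid.coordinator.loadqueuepeon.http.batchSize"));
        int loadingThreads = Integer.parseInt(
                historical.getProperty("druid.segmentCache.numLoadingThreads"));
        return batchSize < loadingThreads;
    }
}
```

Because the two properties live on different node types, a check like this would have to run wherever both configurations are visible, for example in a deployment script.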

In the long run, I plan to remove CuratorLoadQueuePeon and ZkCoordinator. They are being kept now only to be backward compatible and for HttpLoadQueuePeon to prove itself in production.

Documentation for HttpLoadQueuePeon is intentionally left out.

@himanshug
Contributor Author

himanshug commented Oct 17, 2017

I inadvertently closed #4874 and couldn't re-open it, hence the new PR.

@cheddar
Contributor

cheddar commented Oct 19, 2017

👍
