Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to move data between JBOD disks using Cruise Control #10644

Merged
merged 9 commits into from
Nov 1, 2024

Conversation

ShubhamRwt
Copy link
Contributor

@ShubhamRwt ShubhamRwt commented Sep 26, 2024

Type of change

This PR adds the ability to move data between two JBOD disks using Cruise Control. To do the same, we have introduced a new mode called remove-disks.
You can use it like this:

yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: my-rebalance
  labels:
    strimzi.io/cluster: my-cluster
spec:
  # setting the mode as `remove-disks` to move data between the JBOD disks
  mode: remove-disks
  # providing the list of brokers, and the corresponding volumes from which you want to move the replicas
  moveReplicasOffVolumes:
    - brokerId: 0
      volumeIds: [1, 2]
    - brokerId: 2
      volumeIds: [1]
# ...
  • Bugfix
  • Enhancement / new feature
  • Refactoring
  • Documentation

Description

Please describe your pull request

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

@ShubhamRwt ShubhamRwt force-pushed the moveReplicasOffDisk branch 4 times, most recently from 923b3a0 to 01ee408 Compare September 30, 2024 11:55
@ShubhamRwt ShubhamRwt requested a review from scholzj September 30, 2024 12:19
@ShubhamRwt ShubhamRwt marked this pull request as ready for review September 30, 2024 12:19
@ShubhamRwt ShubhamRwt added this to the 0.44.0 milestone Sep 30, 2024
@scholzj
Copy link
Member

scholzj commented Sep 30, 2024

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments. But I'm not really an CC expert. So the CC related parts have to be reviewed by @ppatierno as the SME.

private List<Integer> volumeIds;
private Map<String, Object> additionalProperties;

@Description("Id of the broker which contains the disk from which you want to move the the partition replicas from")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in the other doc comments -> I think we use ID in the docs. Id is only the Java capitalization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will correct it

@@ -31,6 +31,17 @@ public class KafkaRebalanceSpec extends Spec {
private KafkaRebalanceMode mode = KafkaRebalanceMode.FULL;
private List<Integer> brokers;

@Description("List of the brokers and the volumes corresponding to them whose replicas needs to be moved")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Description("List of the brokers and the volumes corresponding to them whose replicas needs to be moved")
@Description("List of the brokers and the volumes corresponding to them whose replicas need to be moved")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe tweak it a bit more:

@Description("List of brokers and their corresponding volumes from which replicas need to be moved.")

Comment on lines 34 to 44
@Description("List of the brokers and the volumes corresponding to them whose replicas needs to be moved")
public List<BrokerAndVolumeIds> getMoveReplicasOffVolumes() {
return moveReplicasOffVolumes;
}

public void setMoveReplicasOffVolumes(List<BrokerAndVolumeIds> moveReplicasOffVolumes) {
this.moveReplicasOffVolumes = moveReplicasOffVolumes;
}

private List<BrokerAndVolumeIds> moveReplicasOffVolumes;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the separation/order of the field declaration and the methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

if (brokerAndVolumeIds != null && !brokerAndVolumeIds.isEmpty()) {
((RemoveDisksOptions.RemoveDisksOptionsBuilder) rebalanceOptionsBuilder).withBrokersandVolumeIds(brokerAndVolumeIds);
} else {
throw new IllegalArgumentException("The brokers and volume id list is mandatory when using the " + mode.toValue() + " rebalancing mode");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ID in capital again

@scholzj scholzj requested a review from PaulRMellor October 1, 2024 09:33
Comment on lines 34 to 44
@Description("List of the brokers and the volumes corresponding to them whose replicas needs to be moved")
public List<BrokerAndVolumeIds> getMoveReplicasOffVolumes() {
return moveReplicasOffVolumes;
}

public void setMoveReplicasOffVolumes(List<BrokerAndVolumeIds> moveReplicasOffVolumes) {
this.moveReplicasOffVolumes = moveReplicasOffVolumes;
}

private List<BrokerAndVolumeIds> moveReplicasOffVolumes;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@ppatierno
Copy link
Member

@ShubhamRwt I would rebase against main, because a refactoring around getting task status of rebalance was just merged. It would be better for you to have the code up to date while addressing comments on this PR. Thanks!

@ShubhamRwt
Copy link
Contributor Author

Yep I am working on that @ppatierno, trying to address all the comments and rebase. After that I will test things in a cluster to be sure everything works

Copy link
Contributor

@PaulRMellor PaulRMellor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few suggestions on the descriptions

@@ -31,6 +31,17 @@ public class KafkaRebalanceSpec extends Spec {
private KafkaRebalanceMode mode = KafkaRebalanceMode.FULL;
private List<Integer> brokers;

@Description("List of the brokers and the volumes corresponding to them whose replicas needs to be moved")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe tweak it a bit more:

@Description("List of brokers and their corresponding volumes from which replicas need to be moved.")

@ShubhamRwt ShubhamRwt force-pushed the moveReplicasOffDisk branch 2 times, most recently from 0b03a1e to f46d8f5 Compare October 16, 2024 09:05
@ppatierno
Copy link
Member

@ShubhamRwt could you rebase against main, I merged another PR fixing some reconciliation issues on KafkaRebalance (You got conflicts as well).

@ShubhamRwt ShubhamRwt force-pushed the moveReplicasOffDisk branch 2 times, most recently from 6214287 to f626564 Compare October 18, 2024 13:10
Copy link
Member

@ppatierno ppatierno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@ppatierno
Copy link
Member

@scholzj @PaulRMellor could you have another pass to check if Shubham addressed your comments as well?

Copy link
Contributor

@PaulRMellor PaulRMellor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
I left a few suggestions around changing ID's to IDs to be consistent

if (brokerAndVolumeIds != null && !brokerAndVolumeIds.isEmpty()) {
((RemoveDisksOptions.RemoveDisksOptionsBuilder) rebalanceOptionsBuilder).withBrokersandVolumeIds(brokerAndVolumeIds);
} else {
throw new IllegalArgumentException("The brokers and volume ID's list is mandatory when using the " + mode.toValue() + " rebalancing mode");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw new IllegalArgumentException("The brokers and volume ID's list is mandatory when using the " + mode.toValue() + " rebalancing mode");
throw new IllegalArgumentException("The brokers and volume IDs list is mandatory when using the " + mode.toValue() + " rebalancing mode");

@@ -73,6 +75,23 @@ public PathBuilder withParameter(CruiseControlParameters param, List<String> val
return this;
}

/**
* Converts the list of broker and volume ID's to a list of string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Converts the list of broker and volume ID's to a list of string
* Converts the list of broker and volume IDs to a list of string

* Rebalance options for removing disks from brokers within the Kafka cluster
*/
public class RemoveDisksOptions extends AbstractRebalanceOptions {
/** list with the ID's of the broker and volumes which will be used by remove-disks endpoint */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** list with the ID's of the broker and volumes which will be used by remove-disks endpoint */
/** list with the IDs of the broker and volumes which will be used by remove-disks endpoint */

private final List<BrokerAndVolumeIds> brokerAndVolumeIds;

/**
* @return List of brokers and volume ID's which will be used by remove-disks endpoint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @return List of brokers and volume ID's which will be used by remove-disks endpoint
* @return List of brokers and volume IDs which will be used by remove-disks endpoint

}

/**
* List of broker and volume ID's to be used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* List of broker and volume ID's to be used
* List of broker and volume IDs to be used

/**
* List of broker and volume ID's to be used
*
* @param brokerAndVolumeIdsList List of broker and volume ID's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param brokerAndVolumeIdsList List of broker and volume ID's
* @param brokerAndVolumeIdsList List of broker and volume IDs

import java.util.Map;

/**
* Configures the broker and Volume ID's for the remove-disks endpoint for Cruise Control
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Configures the broker and Volume ID's for the remove-disks endpoint for Cruise Control
* Configures the broker and Volume IDs for the remove-disks endpoint for Cruise Control

@@ -8229,6 +8230,7 @@ spec:
- full
- add-brokers
- remove-brokers
- remove-disks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this changing here? Has nothing to do with the Kafka CR. I would expect these to be using different types to not have unused fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's changing because the autorebalance related API class definition is using the same enum to get the modes (so also "full" is included even if not really supported) :-/ And when adding remove-disks, it's there as well.
The only solution I see is not using the enum for autorebalance but just a String and adding additional checks on the values, or just sticking with what we have.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe better, adding a new KafkaAutoRebalanceMode with just add-brokers and remove-brokers mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think you can have a separate Enum with the values supported there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok then, I am going to open a trivial PR for this because it's unrelated to this PR. @ShubhamRwt you can then rebase and continue with this one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened this one for it #10744


@Description("ID of the broker that contains the disk from which you want to move the partition replicas.")
@JsonInclude(JsonInclude.Include.NON_NULL)
@JsonProperty("brokerId")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?


@Description("IDs of the disks from which the partition replicas need to be moved.")
@JsonInclude(JsonInclude.Include.NON_NULL)
@JsonProperty("volumeIds")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

@ShubhamRwt
Copy link
Contributor Author

@scholzj Hi, can we run the pipeline tests in this?

@scholzj
Copy link
Member

scholzj commented Oct 25, 2024

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ShubhamRwt
Copy link
Contributor Author

The regression failure doesn't look related to this PR

@scholzj
Copy link
Member

scholzj commented Oct 27, 2024

The regression failure doesn't look related to this PR

Ok, let's re-run it.

@scholzj
Copy link
Member

scholzj commented Oct 27, 2024

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@ShubhamRwt
Copy link
Contributor Author

The test failing in the regression is passing for me locally even when I have rebased to the main branch. @see-quick Do you have any idea why it fails?

@ShubhamRwt
Copy link
Contributor Author

@scholzj I have rebased the PR and made the minor fix. Do you have any other suggestions which I need to fix?

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ShubhamRwt. LGTM.

@scholzj scholzj merged commit a55eabf into strimzi:main Nov 1, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 0.45.0
Development

Successfully merging this pull request may close these issues.

4 participants