docs(kraft): adds a procedure for zookeeper to kraft migration (#9633)
Signed-off-by: prmellor <[email protected]>
Signed-off-by: PaulRMellor <[email protected]>
PaulRMellor authored Mar 2, 2024
1 parent e1bcc24 commit e266f0a
Showing 4 changed files with 291 additions and 10 deletions.
@@ -8,6 +8,9 @@
[role="_abstract"]
If you are using a ZooKeeper-based Kafka cluster, an upgrade requires an update to the Kafka version and the inter-broker protocol version.

If you want to switch a Kafka cluster from using ZooKeeper for metadata management to operating in KRaft mode, the steps must be performed separately from the upgrade.
For information on migrating to a KRaft-based cluster, see xref:proc-deploy-migrate-kraft-str[].

include::../../modules/upgrading/ref-upgrade-kafka-versions.adoc[leveloffset=+1]
include::../../modules/upgrading/con-upgrade-older-clients.adoc[leveloffset=+1]

2 changes: 2 additions & 0 deletions documentation/deploying/deploying.adoc
@@ -24,6 +24,8 @@ include::modules/deploying/proc-deploy-cluster-operator-helm-chart.adoc[leveloff
include::modules/operators/ref-operator-cluster-feature-gates.adoc[leveloffset=+1]
//feature gate release lifecycle
include::modules/operators/ref-operator-cluster-feature-gate-releases.adoc[leveloffset=+2]
//migrating to KRaft
include::modules/deploying/proc-deploy-migrate-kraft.adoc[leveloffset=+1]
//configuration of components
include::assemblies/configuring/assembly-config.adoc[leveloffset=+1]
//creating topics
274 changes: 274 additions & 0 deletions documentation/modules/deploying/proc-deploy-migrate-kraft.adoc
@@ -0,0 +1,274 @@
// Module included in the following assemblies:
//
// deploying/deploying.adoc

[id='proc-deploy-migrate-kraft-{context}']
= Migrating to KRaft mode

[role="_abstract"]
If you are using ZooKeeper for metadata management in your Kafka cluster, you can migrate to using Kafka in KRaft mode.
KRaft mode replaces ZooKeeper for distributed coordination, offering enhanced reliability, scalability, and throughput.

During the migration, you install a quorum of controller nodes as a node pool, which replaces ZooKeeper for management of your cluster.
You enable KRaft migration in the cluster configuration by applying the `strimzi.io/kraft="migration"` annotation.
After the migration is complete, you switch the brokers to KRaft mode and move the controllers out of migration mode by applying the `strimzi.io/kraft="enabled"` annotation.

Before starting the migration, verify that your environment can support Kafka in KRaft mode, as there are a number of xref:ref-operator-use-kraft-feature-gate-str[limitations].
Note also the following:

* Migration is only supported on dedicated controller nodes, not on nodes with dual roles as brokers and controllers.
* Throughout the migration process, ZooKeeper and controller nodes operate in parallel for a period, requiring sufficient compute resources in the cluster.

.Prerequisites

* You must be using Strimzi 0.40 or newer with Kafka 3.7.0 or newer. If you are using an earlier version of Strimzi or Apache Kafka, upgrade before migrating to KRaft mode.
* Verify that the ZooKeeper-based deployment is operating without the following, as they are not supported in KRaft mode:
** The Topic Operator running in bidirectional mode. It should either be in unidirectional mode or disabled.
** JBOD storage. While the `jbod` storage type can be used, the JBOD array must contain only one disk.
* The Cluster Operator that manages the Kafka cluster is running.
* The Kafka cluster deployment uses Kafka node pools.
+
If your ZooKeeper-based cluster is already using node pools, it is ready to migrate.
If not, you can xref:proc-migrating-clusters-node-pools-str[migrate the cluster to use node pools].
To migrate when the cluster is not using node pools, brokers must be contained in a `KafkaNodePool` resource configuration that is assigned a `broker` role and has the name `kafka`.
Support for node pools is enabled in the `Kafka` resource configuration using the `strimzi.io/node-pools: enabled` annotation.
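+
As a quick check that node pool support is enabled, you can read the annotation back from the `Kafka` resource; this suggested command uses the example cluster name (`my-cluster`) and namespace (`my-project`) used in this procedure:
+
[source,shell]
----
# Print the value of the strimzi.io/node-pools annotation (expected: enabled)
kubectl get kafka my-cluster -n my-project \
  -o jsonpath='{.metadata.annotations.strimzi\.io/node-pools}'
----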

In this procedure, the Kafka cluster name is `my-cluster`, which is located in the `my-project` namespace.
The name of the controller node pool created is `controller`.
The node pool for the brokers is called `kafka`.

.Procedure

. For the Kafka cluster, create a node pool with a `controller` role.
+
The node pool adds a quorum of controller nodes to the cluster.
+
.Example configuration for a controller node pool
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaNodePoolApiVersion}
kind: KafkaNodePool
metadata:
  name: controller
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 20Gi
        deleteClaim: false
  resources:
    requests:
      memory: 64Gi
      cpu: "8"
    limits:
      memory: 64Gi
      cpu: "12"
----
+
NOTE: For the migration, you cannot use a node pool of nodes that share the broker and controller roles.

. Apply the new `KafkaNodePool` resource to create the controllers.
+
Errors related to using controllers in a ZooKeeper-based environment are expected in the Cluster Operator logs.
The errors can block reconciliation.
To prevent this, perform the next step immediately.
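+
For example, if the controller node pool configuration shown above is saved to a file, you might apply it as follows; the filename `controller-pool.yaml` is an assumption, not a required name:
+
[source,shell]
----
# Create the controller node pool in the namespace used by the Kafka cluster
kubectl apply -f controller-pool.yaml -n my-project
----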

. Enable KRaft migration in the `Kafka` resource by setting the `strimzi.io/kraft` annotation to `migration`:
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="migration" --overwrite -n my-project
----
+
.Enabling KRaft migration
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: migration
# ...
----
+
Applying the annotation to the `Kafka` resource configuration starts the migration.
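+
To confirm that the annotation was set, you can read it back from the `Kafka` resource; a minimal check:
+
[source,shell]
----
# Print the current value of the strimzi.io/kraft annotation (expected: migration)
kubectl get kafka my-cluster -n my-project \
  -o jsonpath='{.metadata.annotations.strimzi\.io/kraft}'
----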

. Check the controllers have started and the brokers have rolled:
+
[source,shell]
----
kubectl get pods -n my-project
----
+
.Output shows nodes in broker and controller node pools
[source,shell]
----
NAME                      READY   STATUS    RESTARTS
my-cluster-kafka-0        1/1     Running   0
my-cluster-kafka-1        1/1     Running   0
my-cluster-kafka-2        1/1     Running   0
my-cluster-controller-3   1/1     Running   0
my-cluster-controller-4   1/1     Running   0
my-cluster-controller-5   1/1     Running   0
# ...
----

. Check the status of the migration:
+
[source,shell]
----
kubectl get kafka my-cluster -n my-project -w
----
+
.Updates to the metadata state
[source,shell]
----
NAME ... METADATA STATE
my-cluster ... Zookeeper
my-cluster ... KRaftMigration
my-cluster ... KRaftDualWriting
my-cluster ... KRaftPostMigration
----
+
`METADATA STATE` shows the mechanism used to manage Kafka metadata and coordinate operations.
At the start of the migration this is `ZooKeeper`.
+
--
* `ZooKeeper` is the initial state when metadata is only stored in ZooKeeper.
* `KRaftMigration` is the state when the migration is in progress.
The flag to enable ZooKeeper to KRaft migration (`zookeeper.metadata.migration.enable`) is added to the brokers and they are rolled to register with the controllers.
The migration can take some time at this point depending on the number of topics and partitions in the cluster.
* `KRaftDualWriting` is the state when the Kafka cluster is working as a KRaft cluster,
but metadata is stored in both Kafka and ZooKeeper.
Brokers are rolled a second time to remove the migration enablement flag.
* `KRaftPostMigration` is the state when KRaft mode is enabled for brokers.
Metadata is still stored in both Kafka and ZooKeeper.
--
+
The migration status is also represented in the `status.kafkaMetadataState` property of the `Kafka` resource.
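+
If you prefer not to watch the `METADATA STATE` column, you can read the property directly; a suggested check:
+
[source,shell]
----
# Print the current Kafka metadata state reported in the Kafka resource status
kubectl get kafka my-cluster -n my-project \
  -o jsonpath='{.status.kafkaMetadataState}'
----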
+
WARNING: You can xref:proc-deploy-migrate-kraft-rollback-{context}[roll back to using ZooKeeper from this point].
The next step is to enable KRaft.
Rollback cannot be performed after enabling KRaft.

. When the metadata state has reached `KRaftPostMigration`, enable KRaft in the `Kafka` resource configuration by setting the `strimzi.io/kraft` annotation to `enabled`:
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="enabled" --overwrite -n my-project
----
+
.Enabling KRaft
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: enabled
# ...
----

. Check the status of the move to full KRaft mode:
+
[source,shell]
----
kubectl get kafka my-cluster -n my-project -w
----
+
.Updates to the metadata state
[source,shell]
----
NAME ... METADATA STATE
my-cluster ... Zookeeper
my-cluster ... KRaftMigration
my-cluster ... KRaftDualWriting
my-cluster ... KRaftPostMigration
my-cluster ... PreKRaft
my-cluster ... KRaft
----
+
--
* `PreKRaft` is the state when all ZooKeeper-related resources have been automatically deleted.
* `KRaft` is the final state (after the controllers have rolled) when the KRaft migration is finalized.
--
+
NOTE: Depending on how `deleteClaim` is configured for ZooKeeper, its Persistent Volume Claims (PVCs) and persistent volumes (PVs) may not be deleted.
`deleteClaim` specifies whether the PVC is deleted when the cluster is uninstalled. The default is `false`.
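+
If you want to check whether any ZooKeeper PVCs remain after the migration, one simple approach is to filter the PVCs in the namespace; the `grep` pattern assumes the default ZooKeeper PVC naming:
+
[source,shell]
----
# List any remaining ZooKeeper persistent volume claims
kubectl get pvc -n my-project | grep zookeeper
----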

. Remove any ZooKeeper-related configuration from the `Kafka` resource.
+
If present, you can remove the following:
+
* `log.message.format.version`
* `inter.broker.protocol.version`
* `spec.zookeeper.*` properties
+
Removing `log.message.format.version` and `inter.broker.protocol.version` causes the brokers and controllers to roll again.
Removing ZooKeeper properties removes any warning messages related to ZooKeeper configuration being present in a KRaft-operated cluster.
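+
One way to remove these properties is to edit the `Kafka` resource in place; this is only a sketch, and you can instead update and re-apply the source configuration you manage:
+
[source,shell]
----
# Open the Kafka resource for editing and delete the spec.zookeeper section
# and the log.message.format.version and inter.broker.protocol.version settings
kubectl edit kafka my-cluster -n my-project
----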

[id='proc-deploy-migrate-kraft-rollback-{context}']
.Performing a rollback on the migration

Before the migration is finalized by enabling KRaft in the `Kafka` resource, which moves the cluster to the `KRaft` state, you can perform a rollback operation as follows:

. Apply the `strimzi.io/kraft="rollback"` annotation to the `Kafka` resource to roll back the brokers.
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="rollback" --overwrite -n my-project
----
+
.Rolling back KRaft migration
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: rollback
# ...
----
+
The migration process must be in the `KRaftPostMigration` state when you apply the annotation.
The brokers are rolled so that they connect to ZooKeeper again, and the state returns to `KRaftDualWriting`.

. Delete the controller node pool:
+
[source,shell]
----
kubectl delete KafkaNodePool controller -n my-project
----
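+
You can confirm that the controller pods are removed by listing the pods; the `grep` pattern matches the controller pod names shown earlier in this procedure:
+
[source,shell]
----
# Controller pods should terminate once the node pool is deleted
kubectl get pods -n my-project | grep my-cluster-controller
----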

. Apply the `strimzi.io/kraft="disabled"` annotation to the `Kafka` resource to return the metadata state to `ZooKeeper`.
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="disabled" --overwrite -n my-project
----
+
.Switching back to using ZooKeeper
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: disabled
# ...
----
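+
To confirm that the rollback has completed, you can watch the metadata state in the same way as earlier in this procedure until the cluster reports that it is using ZooKeeper again:
+
[source,shell]
----
# Watch until METADATA STATE shows the cluster is back on ZooKeeper
kubectl get kafka my-cluster -n my-project -w
----
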
22 changes: 12 additions & 10 deletions documentation/modules/managing/con-custom-resources-status.adoc
@@ -98,11 +98,12 @@ status:
    - lastTransitionTime: '2023-01-20T17:56:29.396588Z'
      status: 'True'
      type: Ready # <3>
  kafkaVersion: {DefaultKafkaVersion} # <4>
  kafkaNodePools: # <5>
  kafkaMetadataState: KRaft # <4>
  kafkaVersion: {DefaultKafkaVersion} # <5>
  kafkaNodePools: # <6>
    - name: broker
    - name: controller
  listeners: # <6>
  listeners: # <7>
    - addresses:
        - host: my-cluster-kafka-bootstrap.prm-project.svc
          port: 9092
@@ -140,16 +141,17 @@ status:
          -----END CERTIFICATE-----
      name: external4
  observedGeneration: 3 # <7>
  operatorLastSuccessfulVersion: {ProductVersion} # <8>
  observedGeneration: 3 # <8>
  operatorLastSuccessfulVersion: {ProductVersion} # <9>
----
<1> The Kafka cluster ID.
<2> Status `conditions` describe the current state of the Kafka cluster.
<3> The `Ready` condition indicates that the Cluster Operator considers the Kafka cluster able to handle traffic.
<4> The version of Kafka being used by the Kafka cluster.
<5> The node pools belonging to the Kafka cluster.
<6> The `listeners` describe Kafka bootstrap addresses by type.
<7> The `observedGeneration` value indicates the last reconciliation of the `Kafka` custom resource by the Cluster Operator.
<8> The version of the operator that successfully completed the last reconciliation.
<4> The Kafka metadata state, showing the mechanism used (KRaft or ZooKeeper) to manage Kafka metadata and coordinate operations.
<5> The version of Kafka being used by the Kafka cluster.
<6> The node pools belonging to the Kafka cluster.
<7> The `listeners` describe Kafka bootstrap addresses by type.
<8> The `observedGeneration` value indicates the last reconciliation of the `Kafka` custom resource by the Cluster Operator.
<9> The version of the operator that successfully completed the last reconciliation.

NOTE: The Kafka bootstrap addresses listed in the status do not signify that those endpoints or the Kafka cluster is in a `Ready` state.
