Add statefulset for metrics generator #2533

Merged (14 commits) on Jul 6, 2023

zalegrala
Contributor

@zalegrala zalegrala commented Jun 2, 2023

What this PR does:

Updates microservices jsonnet to support a statefulset for the metrics_generator component.

BREAKING CHANGE:

To support a new processor, the metrics generator has been converted from a deployment into a statefulset with a PVC. This will require manual intervention in order to migrate successfully and avoid downtime. Note that currently both a deployment and a statefulset will be managed by the jsonnet for a period of time, after which we will delete the deployment from this repo and you will need to delete user-side references to the tempo_metrics_generator_deployment, as well as delete the deployment itself.

First, just as with the ingester configuration, you will need to specify a pvc_size and a pvc_storage_class for the metrics_generator PVC configuration. For example:

{
  _config+:: {
    metrics_generator+: {
      pvc_size: '10Gi',
      pvc_storage_class: 'local-path',
    },
  }
}

Any user-side overrides for the tempo_metrics_generator_deployment need to be considered for the tempo_metrics_generator_statefulset object.
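As a rough sketch of what carrying an override across might look like, the fragment below applies the same patch to both objects for the duration of the migration. The annotation key and value are hypothetical and stand in for whatever you currently override on the deployment; only the two object names come from this PR.

```jsonnet
{
  // Hypothetical user-side override, shown only for illustration.
  // Replace it with whatever you currently apply to the deployment.
  local generatorOverride = {
    spec+: {
      template+: {
        metadata+: {
          annotations+: { 'example.com/team': 'tracing' },
        },
      },
    },
  },

  // Existing override on the old deployment (remove after the migration).
  tempo_metrics_generator_deployment+: generatorOverride,

  // The same override applied to the new statefulset.
  tempo_metrics_generator_statefulset+: generatorOverride,
}
```

Sharing the patch through a local keeps the two objects from drifting apart while both exist. Fields that only make sense for one workload kind would still need to be handled separately.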

Currently, the deployment replicas are set to 0 by default in the jsonnet, while the statefulset inherits its replica configuration from the $._config object. To keep the deployment running and make the transition without an outage, override the following key:

  tempo_metrics_generator_deployment+:
    { spec+: { replicas: $._config.metrics_generator.replicas } },

This keeps the deployment at the same number of replicas you have configured for the statefulset. Note that resource requirements will be approximately doubled for a period of time while you stabilize the ring and prepare to scale down the deployment.
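Taken together, the PVC configuration and the temporary replica override might sit in a user-side environment file like the sketch below. The import path is an assumption about how the Tempo jsonnet is vendored in your repository, and the PVC values are the example values from this description, not recommendations.

```jsonnet
// Assumption: the Tempo microservices library is vendored at this path.
// Adjust the import to match your own layout.
(import 'microservices/tempo.libsonnet') + {
  _config+:: {
    metrics_generator+: {
      // PVC settings for the new statefulset (example values).
      pvc_size: '10Gi',
      pvc_storage_class: 'local-path',
    },
  },

  // Temporary: keep the old deployment running during the migration.
  // Remove this override once the statefulset instances are healthy.
  tempo_metrics_generator_deployment+: {
    spec+: { replicas: $._config.metrics_generator.replicas },
  },
}
```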

You can check memberlist either with the tempo_memberlist_client_cluster_members_count metric, or you can visit the http://tempo:3200/memberlist page to see that metrics generator instances for both the statefulset and deployment are available.

Once all instances are healthy, you can begin to scale down your deployment and delete the above reference to the tempo_metrics_generator_deployment.
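If you prefer to scale down in stages rather than all at once, one possible approach is to pin the deployment to an explicit, smaller replica count before removing the override. The value below is hypothetical; this is a sketch of the idea, not a prescribed procedure.

```jsonnet
{
  // Intermediate step (hypothetical value): run fewer deployment replicas
  // than the statefulset while confirming the ring stays healthy.
  tempo_metrics_generator_deployment+: {
    spec+: { replicas: 1 },
  },
  // Final step: delete this override entirely, so the deployment falls
  // back to the jsonnet default of 0 replicas.
}
```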

If you skip the steps above, the metrics-generator will incur a brief outage, but everything should function again once the statefulset for the metrics-generator is up and available.

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@zalegrala zalegrala marked this pull request as ready for review June 12, 2023 14:56
Member

@mapno mapno left a comment

The explanation sounds good to me, very detailed. The migration plan makes sense 👍

It's missing the new statefulset in jsonnet-compiled, no? Otherwise, it LGTM

@zalegrala
Contributor Author

Nice, thanks @mapno. I just needed to add that new file. I've pushed a commit for this. Let me know if you spot anything else I missed.

@zalegrala zalegrala merged commit 130de91 into grafana:main Jul 6, 2023
@zalegrala zalegrala deleted the generatorSTS branch July 6, 2023 17:31
4 participants