Monitoring with Prometheus (and metrics-server) #3

Closed
bradenmacdonald opened this issue Dec 6, 2022 · 18 comments

@bradenmacdonald
Contributor

Use Helm subcharts to deploy Prometheus and Grafana as part of the chart.
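
For illustration, a subchart setup along these lines could be declared in the parent chart's `Chart.yaml`. This is only a sketch under assumptions: the chart name `harmony-chart` and the wildcard versions are placeholders, not anything decided in this thread. The two repository URLs are the public community chart repositories for Prometheus and Grafana.

```yaml
# Chart.yaml (sketch, not the project's actual file)
apiVersion: v2
name: harmony-chart          # hypothetical parent chart name
version: 0.1.0               # placeholder
dependencies:
  - name: prometheus
    version: "*"             # placeholder: pin a specific chart version in practice
    repository: https://prometheus-community.github.io/helm-charts
  - name: grafana
    version: "*"             # placeholder: pin a specific chart version in practice
    repository: https://grafana.github.io/helm-charts
```

Running `helm dependency update` against a chart like this fetches both subcharts into its `charts/` directory.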

TBD: Can we have them auto-detect and start monitoring Open edX instances on the cluster as they get deployed using Tutor, like how the ingress controller works?
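
One possible way to get this kind of auto-detection is a `ServiceMonitor` that selects Services across all namespaces by label. The sketch below assumes the Prometheus Operator is installed (for example via the kube-prometheus-stack chart), that each Open edX instance exposes a metrics endpoint, and that Tutor-deployed Services carry a common label. The label, port name, and namespace are all hypothetical.

```yaml
# ServiceMonitor sketch; requires the Prometheus Operator CRDs
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: openedx-instances              # hypothetical name
  namespace: monitoring                # hypothetical namespace
spec:
  namespaceSelector:
    any: true                          # watch Services in every namespace
  selector:
    matchLabels:
      app.kubernetes.io/part-of: openedx   # hypothetical label on each instance
  endpoints:
    - port: http-metrics               # hypothetical named Service port
      path: /metrics
      interval: 30s
```

With a selector like this, a newly deployed instance would be picked up as soon as its Service carries the matching label, much as an ingress controller picks up new Ingress objects.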

@antoviaque

@bradenmacdonald Thanks for creating this task!

@lpm0073 was it the one you are taking on?

@lpm0073
Contributor

lpm0073 commented Dec 7, 2022

yes, this is me

@antoviaque

@lpm0073 Alright, I'm assigning the issue to you then, if that works :)

@antoviaque

@lpm0073 Could you post a status update on this task here? This way we could follow & discuss here async, ahead of the next meeting.

@antoviaque

@lpm0073 Are you still interested in this task?

@antoviaque

Recap from the meeting: @lpm0073 is now unblocked to work on this ticket, building (if I understood correctly) on the autoscaling work from @jfavellar90 in #2.

@antoviaque mentioned this issue on Feb 8, 2023
@antoviaque moved this to Backlog in DevOps Working Group on Feb 8, 2023
@antoviaque moved this from Backlog to In Progress in DevOps Working Group on Feb 8, 2023

@felipemontoya
Member

@lpm0073 could you give us an update on this? are you interested in pursuing this still?

@lpm0073
Contributor

lpm0073 commented Feb 27, 2023

i'm beginning today. i'll start here, focusing on the Karpenter dependencies, which include:

Kubecost effectively shares the same set of dependencies, so i'll use this as a guide for scaffolding purposes. Separately, when running on AWS EKS, these supporting systems will benefit from kube-proxy and coredns, so i'll look into whether we can detect if these are running, and if not then try to at least echo something to the console.

@bradenmacdonald
Contributor Author

@lpm0073 Thanks! Note that metrics-server and VPA are already being worked on in #17

@lpm0073
Contributor

lpm0073 commented Feb 27, 2023

Question to the group: aside from the helm charts, there are a few AWS resources that need to exist, and need to be provided to the helm chart:

  • IAM role for service accounts
  • EC2 instance profile
  • EC2 tagging role and policy attachment

I have these Terraform scripts. how should i incorporate these into this repo?
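
As a rough illustration of how the first of these resources reaches the cluster: on EKS, an IAM role for service accounts is normally attached by annotating the Kubernetes ServiceAccount with the role ARN, so the Terraform output only needs to end up in that annotation (for example through a chart value). The account ID, role name, and ServiceAccount name below are made up.

```yaml
# ServiceAccount sketch; the ARN would come from the Terraform output
apiVersion: v1
kind: ServiceAccount
metadata:
  name: karpenter                      # hypothetical ServiceAccount name
  namespace: karpenter                 # hypothetical namespace
  annotations:
    # Standard EKS annotation for IAM roles for service accounts (IRSA)
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/example-karpenter-irsa  # placeholder ARN
```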

@bradenmacdonald
Contributor Author

@lpm0073 I put some DigitalOcean example Terraform Scripts in https://github.com/openedx/openedx-k8s-harmony/tree/main/infra-example ; you could rename that folder to infra-example-do and create a new infra-example-aws folder with the AWS terraform.

@lpm0073
Contributor

lpm0073 commented Mar 6, 2023

confirming that pr #17 takes care of metrics-server and vpa dependencies for this issue.

@felipemontoya
Member

After PR #17 is merged, we will need to create a new PR with the Helm charts for Grafana and Prometheus.

@bradenmacdonald changed the title from "Monitoring with Prometheus and Grafana" to "Monitoring with Prometheus and metrics-server" on Mar 21, 2023

@adzuci

adzuci commented Mar 21, 2023

Hi there! Though I don't have a lot of context currently, I wanted to mention that 2U has looked into how FairwindsOps lets chart users toggle the installation of Prometheus on or off via the prometheus-metrics.installPrometheusServer flag in https://github.com/FairwindsOps/charts/tree/master/stable/insights-agent

I would be interested in talking more about this over Slack or a call, if others would be too.
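
A comparable opt-in toggle can be sketched with Helm's standard `condition` field on a dependency. This is a generic Helm mechanism, not the FairwindsOps implementation; the chart name and the `prometheus.enabled` key are assumptions chosen for the example.

```yaml
# Chart.yaml (sketch)
apiVersion: v2
name: harmony-chart                    # hypothetical parent chart name
version: 0.1.0                         # placeholder
dependencies:
  - name: prometheus
    version: "*"                       # placeholder: pin a specific chart version in practice
    repository: https://prometheus-community.github.io/helm-charts
    condition: prometheus.enabled      # subchart is installed only when this value is true
---
# values.yaml (sketch)
prometheus:
  enabled: false                       # hypothetical default; users opt in
```

A user who wants the bundled Prometheus would then install with `--set prometheus.enabled=true`, while clusters that already run their own Prometheus keep the default.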

@antoviaque

@adzuci Did you get the information and discussions you wanted during the last meeting about this?

Note that @bradenmacdonald has also created a dedicated task to follow up on topics of interest for you and 2U, at #28. Comments there are welcome! Or during the meeting later today.

@felipemontoya
Member

@lpm0073 we are getting close to finishing the autoscaling part of the charts, which is closely related to this.

Are you interested in pursuing this further?

@felipemontoya changed the title from "Monitoring with Prometheus and metrics-server" to "Monitoring with Prometheus (and metrics-server)" on May 2, 2023

@felipemontoya
Member

@lpm0073 following the meeting we will split this ticket into two less ambiguous issues. Please comment if you have a different opinion

@felipemontoya
Member

Now that the issues for grafana and prometheus have been created we can go ahead and close this.

@github-project-automation (bot) moved this from In Progress to Done in DevOps Working Group on May 30, 2023