
[Action] Separate nodes for Falco and our internal stack #30

Closed · rossf7 opened this issue Jan 23, 2024 · 7 comments · Fixed by #62

Comments

@rossf7
Contributor

rossf7 commented Jan 23, 2024

This issue is to create separate Kubernetes nodes for our internal stack (Flux / Prometheus) and for Falco (the first project we're measuring).

Node isolation is important to ensure we can measure the footprint of projects accurately.

See falcosecurity/cncf-green-review-testing#2 for Falco node requirements

Node requirements

The nodes will be managed by OpenTofu and have the following names and node labels.

  • green-reviews-worker-internal1
      cncf-project: wg-green-reviews
      cncf-project-sub: internal
  • green-reviews-worker-falco1
      cncf-project: falco
      cncf-project-sub: falco-driver-modern-ebpf

We will start by using node labels and selectors to place pods on nodes, but we may also need to introduce node taints and tolerations.
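For illustration, a minimal sketch of a pod spec that targets the Falco node via these labels (the pod name and image are hypothetical; only the labels come from this issue):

```yaml
# Sketch: pin a pod to the Falco worker node using the node labels above.
# Pod name and image are illustrative, not part of this issue.
apiVersion: v1
kind: Pod
metadata:
  name: falco-benchmark-workload
spec:
  nodeSelector:
    cncf-project: falco
    cncf-project-sub: falco-driver-modern-ebpf
  containers:
    - name: workload
      image: busybox
      command: ["sleep", "3600"]
```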

@dipankardas011
Contributor

I think I can help

@rossf7
Contributor Author

rossf7 commented Jan 24, 2024

@dipankardas011 Thank you! My initial thought on this was to replace the list var of worker nodes here with a map that includes the labels. WDYT?

We would need to use node selectors for Prometheus, Flux and any other components so they run on the internal node.

Another approach is to add a taint to the falco node and ask the falco team to add a toleration in https://github.com/falcosecurity/cncf-green-review-testing

cc @nikimanoledaki @AntonioDiTuri
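For reference, a minimal sketch of the taint-plus-toleration alternative (the taint key and value below are assumptions; no names were agreed in this thread):

```yaml
# Sketch of the taint approach; the taint key/value are illustrative only.
# The taint would be applied to the Falco node, e.g.:
#   kubectl taint nodes green-reviews-worker-falco1 dedicated=falco:NoSchedule
# Falco pods would then need a matching toleration:
apiVersion: v1
kind: Pod
metadata:
  name: falco-example
spec:
  tolerations:
    - key: dedicated
      operator: Equal
      value: falco
      effect: NoSchedule
  # A toleration alone does not attract pods to the node, so a
  # node selector is still needed for placement.
  nodeSelector:
    cncf-project: falco
  containers:
    - name: main
      image: busybox
      command: ["sleep", "3600"]
```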

@dipankardas011
Contributor

Yes, we can do this using node label selection or taints and tolerations.

Actually, I was working on a related problem where I wanted to schedule pods only to control plane nodes using taints and tolerations:

https://github.com/kubesimplify/ksctl/blob/ade22eebe56a3d79dee4892b2b0b331ff71b47ef/internal/k8sdistros/universal/ksctl.go#L28-L68

https://github.com/kubesimplify/ksctl/blob/ade22eebe56a3d79dee4892b2b0b331ff71b47ef/internal/k8sdistros/universal/ksctl.go#L88-L95

Maybe you can tell me if it helps with this issue.

@rossf7
Contributor Author

rossf7 commented Jan 25, 2024

@dipankardas011 That's a nice approach, and in the future we may want to run some workloads on our control plane node if we start to max out the "system" node where we run Prometheus, Flux, etc.

A downside I see with a taint per project is that we need to run Kepler on all nodes, so we'd need to add tolerations for all the taints.

How about we start by adding node selectors to the kube-prometheus-stack helm release and the flux bootstrap?

https://fluxcd.io/flux/installation/configuration/boostrap-customization/

If adding a node selector for each component becomes too hard to manage we can look at alternatives later.
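To make this concrete, a sketch of kube-prometheus-stack Helm values that pin its Deployments to the internal node (the value paths follow the chart's documented structure, but treat them as assumptions to verify against the chart version in use):

```yaml
# Sketch: kube-prometheus-stack values pinning components to the
# internal node. Verify these value paths against the chart in use.
prometheusOperator:
  nodeSelector:
    cncf-project: wg-green-reviews
    cncf-project-sub: internal
prometheus:
  prometheusSpec:
    nodeSelector:
      cncf-project: wg-green-reviews
      cncf-project-sub: internal
grafana:
  nodeSelector:
    cncf-project: wg-green-reviews
    cncf-project-sub: internal
# prometheus-node-exporter remains a DaemonSet on every node by design.
```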

@dipankardas011
Contributor

dipankardas011 commented Feb 3, 2024

I have seen scheduling profiles used (to reduce the use of node selectors and ... by just modifying the scheduling profile), but it has a major downside (no support for DaemonSet pods):
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity-per-scheduling-profile

Check the Note sections on that page.
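For context, this is roughly what the per-profile node affinity from the linked docs looks like (the profile name is illustrative; pods opt in via spec.schedulerName, which is why DaemonSet pods are not covered):

```yaml
# Sketch of node affinity per scheduling profile, based on the linked
# Kubernetes docs. The profile name is illustrative.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: internal-node-scheduler
    pluginConfig:
      - name: NodeAffinity
        args:
          addedAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: cncf-project
                      operator: In
                      values:
                        - wg-green-reviews
```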

@dipankardas011
Contributor

dipankardas011 commented Feb 3, 2024

Query

  1. I need to modify the fluxcd manifest for installing prometheus, kepler, ...
  2. Even if we do that, the prometheus exporter will be a DaemonSet and thus present on every node; not sure about kepler!

cc @rossf7 @nikimanoledaki @AntonioDiTuri

@rossf7 rossf7 changed the title [Action] Separate knodes for Falco and our internal stack [Action] Separate nodes for Falco and our internal stack Feb 4, 2024
@rossf7
Contributor Author

rossf7 commented Feb 4, 2024

> I need to modify the fluxcd manifest for installing prometheus, kepler, ...
> Even if we do that, the prometheus exporter will be a DaemonSet and thus present on every node; not sure about kepler!

@dipankardas011 The node selectors are needed in the kube-prometheus-stack helm release and also for the flux components.

fluxcd/flux2#2252 (comment)
https://fluxcd.io/flux/installation/configuration/boostrap-customization/
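Following the pattern in the bootstrap customization docs, a sketch of a kustomization.yaml that could add the node selector to the Flux controllers (the patch target selector is an assumption to verify against the labels on the Flux Deployments):

```yaml
# Sketch: flux-system/kustomization.yaml adding a node selector to all
# Flux controller Deployments. The target selector is an assumption.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - patch: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: all
      spec:
        template:
          spec:
            nodeSelector:
              cncf-project: wg-green-reviews
              cncf-project-sub: internal
    target:
      kind: Deployment
      labelSelector: app.kubernetes.io/part-of=flux
```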

> Even if we do that, the prometheus exporter will be a DaemonSet and thus present on every node; not sure about kepler!

It's fine for the Kepler DaemonSet to schedule pods on all nodes. This is so we can measure the overall energy consumption of the cluster.
