
[Action] Separate nodes for Falco and our internal stack #30

Closed · rossf7 opened this issue Jan 23, 2024 · 7 comments · Fixed by #62

Comments

@rossf7
Contributor

rossf7 commented Jan 23, 2024

This issue is to create separate Kubernetes nodes for our internal stack (Flux / Prometheus) and for Falco (the first project we're measuring).

Node isolation is important to ensure we can measure the footprint of projects accurately.

See falcosecurity/cncf-green-review-testing#2 for Falco node requirements

Node requirements

The nodes will be managed by OpenTofu and have the following names and node labels.

  • green-reviews-worker-internal1
      cncf-project: wg-green-reviews
      cncf-project-sub: internal
  • green-reviews-worker-falco1
      cncf-project: falco
      cncf-project-sub: falco-driver-modern-ebpf

We will start by using node labels and selectors to place pods on nodes, but we may also need to introduce node taints and tolerations.
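For illustration, a minimal sketch of a pod spec that targets the Falco node via these labels (the pod name and image are hypothetical; only the labels come from this issue):

```yaml
# Sketch: pin a pod to the Falco worker node using the node labels above.
# Pod name and image are illustrative, not part of this issue.
apiVersion: v1
kind: Pod
metadata:
  name: falco-benchmark-workload
spec:
  nodeSelector:
    cncf-project: falco
    cncf-project-sub: falco-driver-modern-ebpf
  containers:
    - name: workload
      image: busybox
      command: ["sleep", "3600"]
```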

@dipankardas011
Contributor

I think I can help

@rossf7
Contributor Author

rossf7 commented Jan 24, 2024

@dipankardas011 Thank you! My initial thought on this was to replace the list var of worker nodes here with a map that includes the labels. WDYT?

We would need to use node selectors for Prometheus, Flux and any other components so they run on the internal node.

Another approach is to add a taint to the falco node and ask the falco team to add a toleration in https://github.com/falcosecurity/cncf-green-review-testing

cc @nikimanoledaki @AntonioDiTuri
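For reference, a minimal sketch of the taint-plus-toleration alternative (the taint key and value below are assumptions; no names were agreed in this thread):

```yaml
# Sketch of the taint approach; the taint key/value are illustrative only.
# The taint would be applied to the Falco node, e.g.:
#   kubectl taint nodes green-reviews-worker-falco1 dedicated=falco:NoSchedule
# Falco pods would then need a matching toleration:
apiVersion: v1
kind: Pod
metadata:
  name: falco-example
spec:
  tolerations:
    - key: dedicated
      operator: Equal
      value: falco
      effect: NoSchedule
  # A toleration alone does not attract pods to the node, so a
  # node selector is still needed for placement.
  nodeSelector:
    cncf-project: falco
  containers:
    - name: main
      image: busybox
      command: ["sleep", "3600"]
```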

@dipankardas011
Contributor

Yes, we can do this using node label selection or taints and tolerations.

Actually, I was working on a related problem where I wanted to schedule pods only to control plane nodes using taints and tolerations:

https://github.com/kubesimplify/ksctl/blob/ade22eebe56a3d79dee4892b2b0b331ff71b47ef/internal/k8sdistros/universal/ksctl.go#L28-L68

https://github.com/kubesimplify/ksctl/blob/ade22eebe56a3d79dee4892b2b0b331ff71b47ef/internal/k8sdistros/universal/ksctl.go#L88-L95

Maybe you can tell me if it helps with this issue.

@rossf7
Contributor Author

rossf7 commented Jan 25, 2024

@dipankardas011 That's a nice approach, and in the future we may want to run some workloads on our control plane node if we start to max out the "system" node where we run Prometheus, Flux, etc.

A downside I see with a taint per project is that we need to run Kepler on all nodes, so we'd need to add tolerations for all the taints.

How about we start by adding node selectors to the kube-prometheus-stack helm release and the flux bootstrap?

https://fluxcd.io/flux/installation/configuration/boostrap-customization/

If adding a node selector for each component becomes too hard to manage we can look at alternatives later.
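To make this concrete, a sketch of kube-prometheus-stack Helm values that pin its Deployments to the internal node (the value paths follow the chart's documented structure, but treat them as assumptions to verify against the chart version in use):

```yaml
# Sketch: kube-prometheus-stack values pinning components to the
# internal node. Verify these value paths against the chart in use.
prometheusOperator:
  nodeSelector:
    cncf-project: wg-green-reviews
    cncf-project-sub: internal
prometheus:
  prometheusSpec:
    nodeSelector:
      cncf-project: wg-green-reviews
      cncf-project-sub: internal
grafana:
  nodeSelector:
    cncf-project: wg-green-reviews
    cncf-project-sub: internal
# prometheus-node-exporter remains a DaemonSet on every node by design.
```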

@dipankardas011
Contributor

dipankardas011 commented Feb 3, 2024

I have seen scheduling profiles used (to reduce the use of node selectors and ... by just modifying the scheduling profile), but it has a major downside (no support for DaemonSet pods):
https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#node-affinity-per-scheduling-profile

Check the Note sections on that page.
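For context, this is roughly what the per-profile node affinity from the linked docs looks like (the profile name is illustrative; pods opt in via spec.schedulerName, which is why DaemonSet pods are not covered):

```yaml
# Sketch of node affinity per scheduling profile, based on the linked
# Kubernetes docs. The profile name is illustrative.
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: internal-node-scheduler
    pluginConfig:
      - name: NodeAffinity
        args:
          addedAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: cncf-project
                      operator: In
                      values:
                        - wg-green-reviews
```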

@dipankardas011
Contributor

dipankardas011 commented Feb 3, 2024

Query

  1. I need to modify the fluxcd manifest for installing prometheus, kepler, ...
  2. Even if we do that, the prometheus exporter will be a DaemonSet and thus present on every node; not sure about kepler!

cc @rossf7 @nikimanoledaki @AntonioDiTuri

@rossf7 rossf7 changed the title [Action] Separate knodes for Falco and our internal stack [Action] Separate nodes for Falco and our internal stack Feb 4, 2024
@rossf7
Contributor Author

rossf7 commented Feb 4, 2024

> I need to modify the fluxcd manifest for installing prometheus, kepler, ...
> Even if we do that, the prometheus exporter will be a DaemonSet and thus present on every node; not sure about kepler!

@dipankardas011 The node selectors are needed in the kube-prometheus-stack helm release and also for the flux components.

fluxcd/flux2#2252 (comment)
https://fluxcd.io/flux/installation/configuration/boostrap-customization/
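Following the pattern in the bootstrap customization docs, a sketch of a kustomization.yaml that could add the node selector to the Flux controllers (the patch target selector is an assumption to verify against the labels on the Flux Deployments):

```yaml
# Sketch: flux-system/kustomization.yaml adding a node selector to all
# Flux controller Deployments. The target selector is an assumption.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
patches:
  - patch: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: all
      spec:
        template:
          spec:
            nodeSelector:
              cncf-project: wg-green-reviews
              cncf-project-sub: internal
    target:
      kind: Deployment
      labelSelector: app.kubernetes.io/part-of=flux
```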

> Even if we do that, the prometheus exporter will be a DaemonSet and thus present on every node; not sure about kepler!

It's fine for the Kepler DaemonSet to schedule pods on all nodes. This is so we can measure the overall energy consumption of the cluster.
