garm server configuration should be possible via garm-operator #143
Comments
cc @gabriel-samfira - as you already have similar thoughts about managing garm itself via the operator: it would be nice if you could drop some of your thoughts about this in here 🙏
I've actually started playing around with this here: https://github.com/gabriel-samfira/garm-operator/tree/add-garm-server-controller However, I've been sidetracked a bit. Hopefully I can resume soon. The basic idea is as follows:

This is just a rough idea so far. What do you think?
i'm not sure if i got the entire idea 🙈 The first few bullet-points in your list are awesome - especially as you probably know best how it should be 😅 are you even going one step further to manage the garm-server itself?
The controller would manage a StatefulSet which is the garm-server itself. Not sure if it's worth having separate controllers for config map management. I think we could fit that into the reconciliation loop of the garm server controller. The reconciliation loop would basically use the CRD settings to compose the config map, save the config map, and update the stateful set. If we add CRs for credentials and providers, we just add those two types in the loop as well: list all creds/providers, get CRD settings, compose config map, update stateful set. The controller can watch for changes to credentials, providers and the garm server CRD (which it owns) and react accordingly. For credentials and providers we need separate controllers/webhooks.

I think the best way to go about it would be to create a WiP branch and discuss it once it's ready. Doesn't seem too difficult to write a controller using kubebuilder. An initial draft would be without tests and not necessarily correct, but would give everyone a view of the rough idea.
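To make the reconciliation loop a bit more concrete, here is a minimal sketch of how such a garm-server CR could look - the kind, API group and field names below are assumptions for illustration only, not an agreed design:

apiVersion: garm-operator.mercedes-benz.com/v1alpha1  # group assumed, not final
kind: GarmServer                                      # hypothetical kind
metadata:
  name: garm
  namespace: garm-operator-system
spec:
  image: ghcr.io/cloudbase/garm:v0.1.4                # image/tag only illustrative
  callbackUrl: https://garm.example.com/api/v1/callbacks
  metadataUrl: https://garm.example.com/api/v1/metadata
  webhookUrl: https://garm.example.com/webhooks
  # the controller would render these settings, plus any credentials/provider
  # CRs it watches, into the config map and roll the owned StatefulSet on changes
  credentialsRef:
    name: github-pat
  providers:
    - name: kubernetes
      configSecretRef:
        name: kubernetes-provider-config

The reconcile loop would then be exactly as described: read this CR, list the referenced credentials/providers, compose the config map, and update the StatefulSet.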
hey gabriel, first of all - thanks for the detailed explanation and all the work you already put into this! we can also jump into a dedicated call and take some time to talk about it. just some thoughts from our side which came to our mind: Taking this similarity and reflecting on your approach means it will be possible to offer

Pros:

Cons:

Our preferred approach is still the following:
deployment-manifest, e.g. we have to switch the pods:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "garm.fullname" . }}
  labels:
    {{- include "garm.labels" . | nindent 4 }}
spec:
  replicas: 1
  strategy:
    {{- .Values.deployment.strategy | toYaml | nindent 4 }}
  selector:
    matchLabels:
      {{- include "garm.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "garm.selectorLabels" . | nindent 8 }}
      annotations:
        {{- if .Values.rx.enabled }}
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "{{ .Values.service.port }}"
        prometheus.io/rx-scrape-every-1m: "true" # default
        {{- end }}
    spec:
      serviceAccountName: {{ .Values.serviceAccount.name }}
      securityContext:
        runAsUser: {{ .Values.securityContext.runAsUser }}
        fsGroup: {{ .Values.securityContext.fsGroup }}
      {{- if .Values.garm.provider.azure.enabled }}
      dnsPolicy: "None"
      dnsConfig:
        nameservers:
          - 8.8.8.8
      {{- end }}
      containers:
        - name: garm
          image: "{{ .Values.image.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          command: ["/opt/garm"]
          imagePullPolicy: {{ .Values.image.imagePullPolicy }}
          ports:
            - name: garm
              containerPort: 9997
              protocol: TCP
          livenessProbe:
            tcpSocket:
              port: 9997
            failureThreshold: 4
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 2
          readinessProbe:
            tcpSocket:
              port: 9997
            failureThreshold: 4
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 2
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          volumeMounts:
            - name: {{ include "garm.fullname" . }}-garm-config-volume
              mountPath: /etc/garm
              readOnly: true
            {{- if .Values.garm.provider.openstack.enabled }}
            - name: {{ include "garm.fullname" . }}-openstack-provider-config-volume
              mountPath: /etc/garm/provider-config/openstack
              readOnly: true
            {{- end }}
            {{- if .Values.garm.provider.azure.enabled }}
            - name: {{ include "garm.fullname" . }}-azure-provider-config-volume
              mountPath: /etc/garm/provider-config/azure
              readOnly: true
            {{- end }}
            {{- if .Values.garm.provider.k8s.enabled }}
            - name: {{ include "garm.fullname" . }}-kubernetes-provider-config-volume
              mountPath: /etc/garm/provider-config/kubernetes
              readOnly: true
            {{- end }}
            - name: {{ include "garm.fullname" . }}-database
              mountPath: "/etc/garm/database"
            - name: serviceaccount-volume
              mountPath: /var/run/secrets/kubernetes.io/serviceaccount
        {{- if .Values.garm.provider.azure.enabled }}
        - name: transparent-proxy
          image: internal-transparent-proxy:v0.1.12-nonroot
          command: ["/bin/bash", "-c"]
          args:
            - |
              cat << EOF > /tmp/transparent-proxy.yaml
              port: 3128
              disable_sudo: true
              debug: true
              proxies:
                - addresses:
                    - proxy-server.company.io:3128
              EOF
              /app/cmd/transparent-proxy.binary -config-file /tmp/transparent-proxy.yaml 2>&1 | tee /tmp/logs/proxy-log.txt &
              CHILD_PID=$!
              trap "kill -SIGTERM $CHILD_PID" SIGTERM
              wait $CHILD_PID
          env:
            - name: USER
              value: "{{ .Values.securityContext.runAsUser }}"
          securityContext:
            capabilities:
              add:
                - NET_ADMIN
          resources:
            requests:
              cpu: 10m
              memory: 10Mi
            limits:
              memory: 50Mi
        {{- end }}
        {{- if .Values.queryExporter.enabled }}
        - name: query-exporter
          image: "{{ .Values.queryExporter.image.registry }}/{{ .Values.queryExporter.image.repository }}:{{ .Values.queryExporter.image.tag }}"
          command: ["query-exporter"]
          args:
            - "/etc/query-exporter/config.yaml"
            - "-H"
            - "0.0.0.0"
          imagePullPolicy: {{ .Values.queryExporter.image.imagePullPolicy }}
          ports:
            - name: query-exporter
              containerPort: {{ .Values.queryExporter.service.port }}
              protocol: TCP
          resources:
            {{- toYaml .Values.queryExporter.resources | nindent 12 }}
          volumeMounts:
            - name: query-exporter-config-volume
              mountPath: /etc/query-exporter
            - name: {{ include "garm.fullname" . }}-database
              mountPath: "/etc/garm/database"
        {{- end }}
      volumes:
        {{- if .Values.queryExporter.enabled }}
        - name: query-exporter-config-volume
          projected:
            sources:
              - configMap:
                  name: {{ include "garm.fullname" . }}-query-exporter-config
        {{- end }}
        - name: {{ include "garm.fullname" . }}-garm-config-volume
          projected:
            sources:
              - secret:
                  name: {{ include "garm.fullname" . }}-garm-config
        {{- if .Values.garm.provider.openstack.enabled }}
        - name: {{ include "garm.fullname" . }}-openstack-provider-config-volume
          projected:
            sources:
              - secret:
                  name: {{ include "garm.fullname" . }}-openstack-provider-cloud-config
              - secret:
                  name: {{ include "garm.fullname" . }}-openstack-provider-config
        {{- end }}
        {{- if .Values.garm.provider.azure.enabled }}
        - name: {{ include "garm.fullname" . }}-azure-provider-config-volume
          projected:
            sources:
              - secret:
                  name: {{ include "garm.fullname" . }}-azure-provider-config
        {{- end }}
        {{- if .Values.garm.provider.k8s.enabled }}
        - name: {{ include "garm.fullname" . }}-kubernetes-provider-config-volume
          projected:
            sources:
              {{- range $runner_cluster := .Values.garm.provider.k8s.runnerClusters }}
              {{- if ne $runner_cluster.name "local" }}
              - secret:
                  name: {{ include "garm.fullname" $ }}-kubernetes-provider-kubeconfig
              {{- end }}
              - configMap:
                  name: {{ include "garm.fullname" $ }}-kubernetes-provider-config
              {{- end }}
        {{- end }}
        - name: {{ include "garm.fullname" . }}-database
          persistentVolumeClaim:
            claimName: {{ include "garm.fullname" . }}
        # use a custom token with specified expiration time - this implementation behaves like the default token mechanism
        # see: https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume
        - name: serviceaccount-volume
          projected:
            sources:
              - serviceAccountToken:
                  path: token
                  expirationSeconds: 600 # don't set this to exactly 3607, otherwise the token will be valid for 1 year
              - configMap:
                  items:
                    - key: ca.crt
                      path: ca.crt
                  name: kube-root-ca.crt
              - downwardAPI:
                  items:
                    - fieldRef:
                        apiVersion: v1
                        fieldPath: metadata.namespace
                      path: namespace
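For reference, a minimal values.yaml sketch that would satisfy the references in the template above - the key names mirror the template, the concrete values are only placeholders:

image:
  registry: ghcr.io
  repository: cloudbase/garm
  tag: ""                 # falls back to .Chart.AppVersion
  imagePullPolicy: IfNotPresent
deployment:
  strategy:
    type: Recreate
service:
  port: 9997
serviceAccount:
  name: garm
securityContext:
  runAsUser: 1000
  fsGroup: 1000
rx:
  enabled: false          # toggles the prometheus scrape annotations
garm:
  provider:
    openstack:
      enabled: false
    azure:
      enabled: false      # also toggles the transparent-proxy sidecar and dnsConfig
    k8s:
      enabled: true
      runnerClusters:
        - name: local     # non-"local" clusters additionally mount a kubeconfig secret
queryExporter:
  enabled: false
resources: {}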
You're right. That makes a lot of sense. A proper helm chart is definitely preferable to reinventing the wheel. In that case, we should be fine with just implementing the resources that were moved to the DB (gh endpoints and credentials - potentially providers in the future), and potentially offer a helm chart that folks can use. It's so easy to overcomplicate things. Thanks for the reality check!

Edit: It makes sense to have any helm chart live in the operator project. GARM should just do what it does. At most it should worry about packaging (deb, rpm, snap, flatpak, plain binary releases, etc.). But automation should be separate.
What is the feature you would like to have?
With the recent changes in garm (and also future ideas in the backlog) it will be possible to "configure" a garm-server instance in a declarative way. The current version of garm-operator already takes care of whether a garm instance is initialized or not and performs the initialization if needed. As we already check on every api-request for an initialized garm-server (func EnsureAuth), we should at least decouple this implementation and use some existing primitives of kubernetes to make the code a bit easier to read.
I would like to see a dedicated controller which is responsible for the generic garm-server management and controllers for scoped resources (separation of concerns).
Current known resources
Anything else you would like to add?
As the configuration of the github backend has already moved out of the config file and into the api (with cloudbase/garm#243), and we have an issue for this (#127), we should agree on the separation level first.
What could the new CRs look like? Maybe we can "design" the CRs first before we start implementing 🙏
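To get the "design first" discussion started, here is a rough sketch of how endpoint and credentials CRs could look - the API group, kinds and field names are assumptions loosely modeled on what garm now exposes via its API, nothing of this is implemented yet:

apiVersion: garm-operator.mercedes-benz.com/v1alpha1  # group assumed, not final
kind: GitHubEndpoint                                  # hypothetical kind
metadata:
  name: github-com
spec:
  description: public github.com
  baseUrl: https://github.com
  apiBaseUrl: https://api.github.com
  uploadBaseUrl: https://uploads.github.com
---
apiVersion: garm-operator.mercedes-benz.com/v1alpha1
kind: GitHubCredentials                               # hypothetical kind
metadata:
  name: github-pat
spec:
  description: PAT for our org
  endpointRef:
    name: github-com                                  # links back to the endpoint above
  authType: pat
  secretRef:                                          # the token itself stays in a plain Secret
    name: github-pat-secret
    key: token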