Update logging architecture #2315

Merged
merged 11 commits on Oct 24, 2024
1 change: 1 addition & 0 deletions .vale/styles/config/vocabularies/docs/accept.txt
@@ -51,6 +51,7 @@ node[pP]ool[s]?
NLB[s]?
passthrough
perfectly
Promtail
quickly
randomly
rapidly
@@ -7,6 +7,9 @@ menu:
main:
identifier: getting-started-observability-logging-architecture
parent: getting-started-observability-logging
principal:
identifier: overview-observability-logging
parent: overview-observability
user_questions:
- What is the logging architecture?
- Why is Giant Swarm using Loki?
@@ -17,7 +20,7 @@ aliases:
- /getting-started/observability/logging/architecture
owner:
- https://github.com/orgs/giantswarm/teams/team-atlas
last_review_date: 2024-03-21
last_review_date: 2024-10-21
---

Logging is an important pillar of observability and it is thus only natural that Giant Swarm provides and manages a logging solution for operational purposes.
Expand All @@ -35,10 +38,21 @@ In this diagram, you can see that we run the following tools in each management

- `Grafana Loki` that is accessible through our managed Grafana instance.
- `multi-tenant-proxy`, a proxy component used to handle multi-tenancy for Loki.
- A couple of logging agents (`Grafana Promtail` and `Grafana Agent`) that run on the management cluster and your workload clusters alike. We currently need two different tools for different purposes.
  - Promtail is used to retrieve the container and Kubernetes audit logs.
- A couple of scraping agents run on the management cluster and your workload clusters. There are different tools for different purposes:
- Promtail is used in older Giant Swarm releases to retrieve the container and Kubernetes audit logs.
- Alloy is used in newer Giant Swarm releases to retrieve the container and Kubernetes audit logs.
  - Grafana Agent is used to retrieve the Kubernetes events.
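On clusters running Alloy, the collection pipeline described above can be sketched roughly as follows. This is an illustrative assumption, not the actual Giant Swarm configuration: the component labels, push URL, and tenant ID are placeholders.

```alloy
// Hypothetical sketch: discover pods, tail their logs, push to Loki
// through the multi-tenant proxy. All names and URLs are assumed.
discovery.kubernetes "pods" {
  role = "pod"
}

loki.source.kubernetes "system" {
  targets    = discovery.kubernetes.pods.targets
  forward_to = [loki.write.management_cluster.receiver]
}

loki.write "management_cluster" {
  endpoint {
    // Placeholder endpoint for the proxy in front of Loki.
    url       = "https://loki.example.gigantic.io/loki/api/v1/push"
    tenant_id = "example-tenant"
  }
}
```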

### Release compatibility

Release|Alloy|Promtail|Grafana Agent|
-------|-----|--------|-------------|
CAPA from v29.2.0|<i class="fas fa-check"></i>|<i class="fas fa-times"></i>|<i class="fas fa-check"></i>|
CAPZ from v29.1.0|<i class="fas fa-check"></i>|<i class="fas fa-times"></i>|<i class="fas fa-check"></i>|
CAPA before v29.2.0|<i class="fas fa-times"></i>|<i class="fas fa-check"></i>|<i class="fas fa-check"></i>|
CAPZ before v29.1.0|<i class="fas fa-times"></i>|<i class="fas fa-check"></i>|<i class="fas fa-check"></i>|
vintage (all releases)|<i class="fas fa-times"></i>|<i class="fas fa-check"></i>|<i class="fas fa-check"></i>|

If you want to play with Loki, you should definitely check out our guides explaining [how to access Grafana]({{< relref "/tutorials/observability/data-exploration/accessing-grafana" >}}) and how to [explore logs with LogQL]({{< relref "/tutorials/observability/data-exploration/exploring-logs" >}}).
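For example, once you can reach Grafana's Explore view, a LogQL query along these lines surfaces error lines from system pods. The `cluster_id` label is an assumption for illustration; check which labels are available in your installation.

```logql
{cluster_id="abc12", namespace="kube-system"} |= "error" | logfmt
```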

## Logs stored by Giant Swarm
@@ -49,11 +63,11 @@ The logging agents that we have deployed on management and workload clusters currently store the following logs:

- Kubernetes Pod logs from the `kube-system` and `giantswarm` namespaces.
- Kubernetes Events created in the `kube-system` and `giantswarm` namespaces.
- [Kubernetes audit logs]({{< relref "./audit-logs#kubernetes-audit-logs" >}})
- [Kubernetes audit logs]({{< relref "../../../vintage/getting-started/observability/logging/audit-logs#kubernetes-audit-logs" >}})

In the future, we will also store the following logs:

- [Machine (Node) audit logs]({{< relref "./audit-logs#machine-audit-logs" >}})
- [Machine (Node) audit logs]({{< relref "../../../vintage/getting-started/observability/logging/audit-logs#machine-audit-logs" >}})
- Teleport audit logs, tracked in https://github.com/giantswarm/roadmap/issues/3250
- Giant Swarm customer workload logs as part of our observability platform, tracked in https://github.com/giantswarm/roadmap/issues/2771

@@ -15,7 +15,7 @@ user_questions:
- Why do my clusters run Alloy?
---

By default, Giant Swarm clusters starting from CAPA v29.2.0, and CAPZ v29.1.0 are equipped with [Alloy](https://grafana.com/docs/alloy), an Observability data collector. It's configured to collect system logs from the cluster and forward them to a central [Loki](https://grafana.com/docs/loki) instance running on the management cluster. See [Logging architecture]({{< relref "/vintage/getting-started/observability/logging/architecture" >}}) for more details.
By default, Giant Swarm clusters starting from CAPA v29.2.0, and CAPZ v29.1.0 are equipped with [Alloy](https://grafana.com/docs/alloy), an Observability data collector. It's configured to collect system logs from the cluster and forward them to a central [Loki](https://grafana.com/docs/loki) instance running on the management cluster. See [Logging architecture]({{< relref "../../../../overview/observability/logging/architecture" >}}) for more details.

The observability platform allows you to ingest logs from your workloads in a self-service way, using [PodLogs][1] to select which pods' logs to ingest.
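A minimal PodLogs resource might look like the sketch below. The API version, namespace, and labels are assumptions for illustration; consult the PodLogs CRD actually installed in your cluster for the exact schema.

```yaml
# Hypothetical example; field names follow the Grafana PodLogs CRD
# and may differ in your installation.
apiVersion: monitoring.grafana.com/v1alpha1
kind: PodLogs
metadata:
  name: my-app-logs
  namespace: my-app          # assumed namespace
spec:
  selector:
    matchLabels:
      app: my-app            # pods whose logs should be ingested
```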

Binary file not shown.