Cswatt/operator2 #8494

Merged: 28 commits, Oct 2, 2020

@cswatt cswatt commented Sep 9, 2020

What does this PR do?

Motivation

Preview link

https://docs-staging.datadoghq.com/cswatt/operator2/agent/kubernetes

Check preview base path using the URL in details in preview status check.

Additional Notes

@cswatt added the "WORK IN PROGRESS" and "Do Not Merge" labels on Sep 9, 2020
@cswatt requested review from a team as code owners on September 9, 2020 18:42
github-actions bot added the "agent" and "Architecture" labels on Sep 10, 2020
datadog-agent-k26tp 1/1 Running 0 5m59s 10.244.2.13 kind-worker
datadog-agent-zcxx7 1/1 Running 0 5m59s 10.244.1.7 kind-worker2
```
### Tolerations

This section is a bit confusing to me. I think we should move it out of this page to an advanced topic, explain why it is needed, and show exactly what configuration the user needs to add. The current YAML snippet looks like it replaces the entire DatadogAgent manifest that was previously downloaded, rather than adding only the tolerations field.
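
One way to act on the comment above is to show only the lines the user adds. The following is a minimal sketch, assuming the `datadoghq.com/v1alpha1` `DatadogAgent` schema, where tolerations are expected under `spec.agent.config` (verify against the tolerations example linked as [10]). The file name `tolerations-fragment.yaml` and the `operator: Exists` toleration are illustrative choices, not taken from the PR:

```shell
# Write only the tolerations addition to a scratch file for review.
# Assumption: v1alpha1 DatadogAgent schema (spec.agent.config.tolerations).
cat > tolerations-fragment.yaml <<'EOF'
spec:
  agent:
    config:
      tolerations:
        - operator: Exists   # tolerates every taint; narrow this for production use
EOF
```

Merging these lines into the `datadog-agent.yaml` that was already downloaded, then re-running the `kubectl apply` command from the page, would add the toleration without replacing the rest of the manifest.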

[8]: https://github.com/DataDog/datadog-operator/blob/master/examples/datadog-agent-apm.yaml
[9]: https://github.com/DataDog/datadog-operator/blob/master/examples/datadog-agent-with-clusteragent.yaml
[10]: https://github.com/DataDog/datadog-operator/blob/master/examples/datadog-agent-with-tolerations.yaml
[11]: https://app.datadoghq.com/account/settings#api
{{% /tab %}}
{{< /tabs >}}

Can we add the missing Operator tab in the Events Collection section below?

- [`Helm`][2] for deploying the `datadog-operator`.
- [`Kubectl` CLI][3] for installing the `datadog-agent`.

## Deploy the Datadog Operator

We have a very important goal in mind: to make the Operator the recommended way to deploy the Datadog Agent (over Helm and DaemonSet). To get there, one of the key requirements is that the Operator be the easiest and quickest option to deploy. The steps below are accurate, but there are too many of them, and they give more information than a quick "getting started" guide needs. We designed and implemented a quick start path, which is documented here: https://github.com/DataDog/datadog-operator/blob/master/docs/getting_started.md#deploy-an-agent-with-the-operator

My suggestion is to use the steps there to deploy the Operator+Agent in as few steps as possible, and to move the more advanced steps that offer additional customization to a different location on the page or to another page.

We can move the verbose steps below to the first part of the Operator Configurations page, and add a link called "additional/advanced configuration" after the quick start steps on this page. WDYT?

$ kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml
```

[1]: https://github.com/DataDog/datadog-operator/blob/master/examples/datadog-agent-apm.yaml
{{% /tab %}}
{{< /tabs >}}
**Note**: On minikube, you may receive an `Unable to detect the kubelet URL automatically` error. In this case, set `DD_KUBELET_TLS_VERIFY=false`.

Further down in the APM page, there's a table called "Agent Environment Variables" that should be replaced by the corresponding options that the DatadogAgent CRD provides. The reason is that env vars cannot be provided to the Agent directly, since the Agent is created later by the Operator (at runtime).
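
As a rough illustration of the CRD-based approach described above, the `agent.apm.enabled` option from the configuration table would be set in the `DatadogAgent` resource rather than through an Agent environment variable. This sketch assumes the option path maps directly onto nested keys under `spec`; the file name `apm-fragment.yaml` is hypothetical:

```shell
# Write the APM-related addition to a scratch file.
# Assumption: the dotted option name `agent.apm.enabled` corresponds to
# spec.agent.apm.enabled in the DatadogAgent manifest.
cat > apm-fragment.yaml <<'EOF'
spec:
  agent:
    apm:
      enabled: true   # the Operator configures the Agent it creates accordingly
EOF
```

The Operator reads this field when it creates the Agent, which is why the setting belongs in the manifest and not in the Agent's environment.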

- repo_name: datadog-operator
contents:

- action: pull-and-push-file

The configuration options table in https://docs-staging.datadoghq.com/cswatt/operator2/agent/kubernetes/operator_configuration/ doesn't resize well and leaves a lot of empty space in the first column. Can we set the width of the first column to 25% of the view width and break the option name across multiple lines?

github-actions bot added the "FAQ" label on Sep 18, 2020
github-actions bot added the "Guide" label on Sep 18, 2020

To use the Datadog Operator, deploy it in your Kubernetes cluster. Then create a `DatadogAgent` Kubernetes resource that contains the Datadog deployment configuration:

1. Download the [Datadog Operator project zip ball][4]. Source code can be found at [`DataDog/datadog-operator`][1].

@cohenyair cohenyair Sep 24, 2020

@ceciliawatt the operator is now available on Helm, so we should replace steps (1) and (2) with:

  1. helm repo add datadog https://helm.datadoghq.com
  2. helm install datadog/datadog-operator --version 0.1.0

The version is not mandatory if users always want to install the latest, so I think it is better not to document `--version 0.1.0`.
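
A hedged sketch of the unpinned install discussed above. It assumes Helm 3, which requires a release name (or `--generate-name`); the release name `datadog-operator` and the `run` dry-run wrapper are illustrative and not part of the PR. With `DRY_RUN=1` (the default here) the commands are only printed:

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Register the Datadog chart repository (URL taken from the comment above).
run helm repo add datadog https://helm.datadoghq.com

# Install the latest chart version by omitting --version.
# "datadog-operator" is a hypothetical release name (Helm 3 needs one).
run helm install datadog-operator datadog/datadog-operator
```

Leaving out `--version` means the docs do not need updating for each chart release, at the cost of installs that may differ over time.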

@cswatt removed the "Do Not Merge" and "WORK IN PROGRESS" labels on Sep 30, 2020

@apigirl apigirl left a comment

first round of review

tag: 'Documentation'
text: 'Datadog and Kubernetes'
---

add in the traditional beta disclaimer

@@ -174,11 +174,64 @@ To install the Datadog Agent on your Kubernetes cluster:
{{% /tab %}}
{{% tab "Operator" %}}

[The Datadog Operator][1] is in public beta. The Datadog Operator is a way to deploy the Datadog Agent on Kubernetes and OpenShift. It reports deployment status, health, and errors in its Custom Resource status, and it limits the risk of misconfiguration thanks to higher-level configuration options. To get started, check out the [Getting Started page][2] in the [Datadog Operator repo][1] or install the operator from the [OperatorHub.io Datadog Operator page][3].
[The Datadog Operator][1] is in public beta. The Datadog Operator is a way to deploy the Datadog Agent on Kubernetes and OpenShift. It reports deployment status, health, and errors in its Custom Resource status, and it limits the risk of misconfiguration thanks to higher-level configuration options.

same beta call out feedback

content/en/agent/kubernetes/operator_configuration.md
### Operator environment variables
| Environment variable | Description |
| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `agent.apm.enabled` | Enable this to enable APM and tracing, on port 8126 ref: https://github.com/DataDog/docker-dd-agent#tracing-from-the-host |

You should make these actual page-level links, using the "For a reference, see..." wording. As written, the URL isn't clickable in the docs, so someone would have to copy and paste it.

|--------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| agent.additionalAnnotations | AdditionalAnnotations provide annotations that will be added to the Agent Pods. |
| agent.additionalLabels | AdditionalLabels provide labels that will be added to the cluster checks runner Pods. |
| agent.apm.enabled | Enable this to enable APM and tracing, on port 8126 ref: https://github.com/DataDog/docker-dd-agent#tracing-from-the-host |

Same comment about making these actual clickable URLs. You also use periods inconsistently in this table.

| agent.deploymentStrategy.canary.replicas | |
| agent.deploymentStrategy.reconcileFrequency | The reconcile frequency of the ExtendDaemonSet |
| agent.deploymentStrategy.rollingUpdate.maxParallelPodCreation | The maxium number of pods created in parallel. Default value is 250. |
| agent.deploymentStrategy.rollingUpdate.maxPodSchedulerFailure | MaxPodSchedulerFailure the maxinum number of not scheduled on its Node due to a scheduler failure: resource constraints. Value can be an absolute number (ex: 5) or a percentage of total number of DaemonSet pods at the start of the update (ex: 10%). Absolute |

I think this needs to be rephrased. What does it mean to be "the maximum number of not scheduled"?

| agent.log.enabled | Enables this to activate Datadog Agent log collection. ref: https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/#log-collection-setup |
| agent.log.logsConfigContainerCollectAll | Enable this to allow log collection for all containers. ref: https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/#log-collection-setup |
| agent.log.openFilesLimit | Set the maximum number of logs files that the Datadog Agent will tail up to. Increasing this limit can increase resource consumption of the Agent. ref: https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/#log-collection-setup Default to 100 |
| agent.log.podLogsPath | This to allow log collection from pod log path. Default to /var/log/pods |

Suggested change
| agent.log.podLogsPath | This to allow log collection from pod log path. Default to /var/log/pods |
| agent.log.podLogsPath | Set this to allow log collection from pod log path. Default to `/var/log/pods`. |

| agent.log.logsConfigContainerCollectAll | Enable this to allow log collection for all containers. ref: https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/#log-collection-setup |
| agent.log.openFilesLimit | Set the maximum number of logs files that the Datadog Agent will tail up to. Increasing this limit can increase resource consumption of the Agent. ref: https://docs.datadoghq.com/agent/basic_agent_usage/kubernetes/#log-collection-setup Default to 100 |
| agent.log.podLogsPath | This to allow log collection from pod log path. Default to /var/log/pods |
| agent.log.tempStoragePath | This path (always mounted from the host) is used by Datadog Agent to store information about processed log files. If the Datadog Agent is restarted, it allows to start tailing the log files from the right offset Default to /var/lib/datadog-agent/logs |

Suggested change
| agent.log.tempStoragePath | This path (always mounted from the host) is used by Datadog Agent to store information about processed log files. If the Datadog Agent is restarted, it allows to start tailing the log files from the right offset Default to /var/lib/datadog-agent/logs |
| agent.log.tempStoragePath | This path (always mounted from the host) is used by the Datadog Agent to store information about processed log files. If the Datadog Agent is restarted, it allows to start tailing the log files from the right offset Default to `/var/lib/datadog-agent/logs` |

| agent.process.resources.requests | Requests describes the minimum amount of compute resources required. If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, otherwise to an implementation-defined value. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/ |
| agent.rbac.create | Used to configure RBAC resources creation |
| agent.rbac.serviceAccountName | Used to set up the service account name to use Ignored if the field Create is true |
| agent.systemProbe.appArmorProfileName | AppArmorProfileName specify a apparmor profile |

Suggested change
| agent.systemProbe.appArmorProfileName | AppArmorProfileName specify a apparmor profile |
| agent.systemProbe.appArmorProfileName | Use `AppArmorProfileName` to specify a `apparmor` profile. |

| agent.rbac.create | Used to configure RBAC resources creation |
| agent.rbac.serviceAccountName | Used to set up the service account name to use Ignored if the field Create is true |
| agent.systemProbe.appArmorProfileName | AppArmorProfileName specify a apparmor profile |
| agent.systemProbe.bpfDebugEnabled | BPFDebugEnabled logging for kernel debug |

@apigirl apigirl Sep 30, 2020

Suggested change
| agent.systemProbe.bpfDebugEnabled | BPFDebugEnabled logging for kernel debug |
| agent.systemProbe.bpfDebugEnabled | Use `BPFDebugEnabled` logging to debug your kernel. |

@apigirl apigirl left a comment

just a few more small things!

@cswatt cswatt merged commit 62a91d3 into master Oct 2, 2020
@cswatt cswatt deleted the cswatt/operator2 branch October 2, 2020 00:37
Labels: agent, Architecture, FAQ, Guide
5 participants