Add get agent host tags util function #18269

Merged: natashadada merged 22 commits into master from natasha.dada/combine-tags-4 on Aug 16, 2024

Conversation

natashadada (Contributor) commented Aug 8, 2024

What does this PR do?

In DataDog/datadog-agent#28027, I added the get_host_tags function. In this PR, I add a util function that will convert these tags into a list of strings.
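
As a rough illustration of the conversion described above, here is a minimal sketch based on the fragments quoted later in the review. The function name `flatten_host_tags` and the payload shape (a JSON object mapping a source name to a list of tag strings) are assumptions made for this example, not the code that was merged.

```python
import json
import logging

logger = logging.getLogger(__name__)


def flatten_host_tags(host_tags_json):
    """Hypothetical helper: flatten a JSON object of tag lists into one list of strings.

    Assumes the payload looks like {"system": ["env:prod"], ...}; this shape is an
    assumption for illustration only.
    """
    result = []
    try:
        # "null" or an empty payload falls back to an empty dict.
        tags_dict = json.loads(host_tags_json) or {}
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError; None raises TypeError.
        logger.warning("Failed to parse tags: %s", host_tags_json)
        return result

    if not isinstance(tags_dict, dict):
        logger.warning("Expected a JSON object of tags, got: %s", host_tags_json)
        return result

    for key, value in tags_dict.items():
        if isinstance(value, list):
            result.extend(value)
        else:
            logger.warning("Unable to use tags for key %s since %s is not a list", key, value)
    return result


if __name__ == "__main__":
    payload = '{"system": ["env:prod", "team:db"], "cloud": ["zone:us-east1"]}'
    print(flatten_host_tags(payload))
```

Values that are not lists are skipped with a warning rather than coerced, which mirrors the branch discussed in the review below.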

Motivation

After version 7.51 of the agent, resolved_hostname is no longer set to agent_hostname. For database monitoring, by default we want the host to be the database host, not the agent host. As a result, agent host tags are not propagated to postgresql.* metrics. I'd like to use this function from the postgres integration to propagate those tags. This PR only adds the function and doesn't call it anywhere; the next PR will update the postgres integration to call it.

Additional Notes

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Changelog entries must be created for modifications to shipped code
  • Add the qa/skip-qa label if the PR doesn't need to be tested during QA.
  • If you need to backport this PR to another branch, you can add the backport/<branch-name> label to the PR and it will automatically open a backport PR once this one is merged

github-actions bot commented Aug 8, 2024

The validations job has failed; please review the Files changed tab for possible suggestions to resolve.

codecov bot commented Aug 8, 2024

Codecov Report

Attention: Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.65%. Comparing base (31ec119) to head (b3dfb27).
Report is 20 commits behind head on master.

Additional details and impacted files
Flag Coverage Δ
activemq 52.80% <ø> (ø)
activemq_xml 82.31% <ø> (ø)
airflow 92.20% <ø> (?)
amazon_msk 88.91% <ø> (ø)
ambari 85.80% <ø> (ø)
apache 95.08% <ø> (ø)
arangodb 98.23% <ø> (ø)
argo_rollouts 90.00% <ø> (ø)
argo_workflows 87.87% <ø> (ø)
aspdotnet 100.00% <ø> (ø)
avi_vantage 91.35% <ø> (ø)
aws_neuron 92.42% <ø> (ø)
azure_iot_edge 82.08% <ø> (ø)
boundary 100.00% <ø> (ø)
btrfs 82.91% <ø> (ø)
cacti 87.90% <ø> (ø)
calico 84.61% <ø> (ø)
cassandra 66.66% <ø> (?)
cert_manager 77.41% <ø> (ø)
cilium 78.20% <ø> (?)
citrix_hypervisor 87.50% <ø> (ø)
cloud_foundry_api 96.11% <ø> (ø)
cloudera 99.51% <ø> (ø)
cockroachdb 93.19% <ø> (ø)
confluent_platform ?
consul 91.82% <ø> (ø)
coredns 94.61% <ø> (ø)
couch 95.67% <ø> (+0.70%) ⬆️
crio 89.79% <ø> (ø)
datadog_checks_base 89.81% <94.28%> (+1.15%) ⬆️
datadog_checks_dev 77.40% <ø> (+0.07%) ⬆️
datadog_checks_downloader 81.35% <ø> (+4.09%) ⬆️
datadog_cluster_agent 90.19% <ø> (ø)
dcgm 92.10% <ø> (ø)
ddev 87.97% <ø> (ø)
directory 95.68% <ø> (+0.43%) ⬆️
disk 89.34% <ø> (-1.43%) ⬇️
dns_check 93.33% <ø> (ø)
druid 97.70% <ø> (ø)
ecs_fargate 83.52% <ø> (ø)
eks_fargate 94.05% <ø> (ø)
envoy 92.97% <ø> (+3.46%) ⬆️
esxi 93.05% <ø> (ø)
etcd 95.56% <ø> (ø)
external_dns 89.28% <ø> (ø)
fluentd 84.32% <ø> (ø)
fluxcd 88.31% <ø> (ø)
fly_io 96.65% <ø> (ø)
foundationdb 83.83% <ø> (ø)
gitlab_runner 92.10% <ø> (ø)
gunicorn 92.07% <ø> (ø)
hazelcast 92.39% <ø> (ø)
hdfs_datanode 89.74% <ø> (ø)
hdfs_namenode 86.72% <ø> (ø)
hive 51.42% <ø> (ø)
hivemq 61.90% <ø> (ø)
http_check 95.32% <ø> (+2.02%) ⬆️
hudi 73.91% <ø> (ø)
ibm_ace 92.25% <ø> (ø)
ibm_db2 86.87% <ø> (ø)
ibm_i 81.91% <ø> (ø)
ibm_mq 91.28% <ø> (ø)
ignite 46.66% <ø> (ø)
impala 97.97% <ø> (ø)
istio 78.14% <ø> (+0.51%) ⬆️
jboss_wildfly 47.36% <ø> (ø)
kafka 64.70% <ø> (ø)
karpenter 94.36% <ø> (ø)
kong 87.62% <ø> (ø)
kube_apiserver_metrics 97.74% <ø> (ø)
kube_controller_manager 97.89% <ø> (ø)
kube_dns 95.97% <ø> (ø)
kube_metrics_server 94.87% <ø> (ø)
kube_proxy 96.80% <ø> (ø)
kube_scheduler 97.92% <ø> (ø)
kubelet 91.01% <ø> (ø)
kubernetes_cluster_autoscaler 93.22% <ø> (ø)
kubernetes_state 89.50% <ø> (ø)
kyototycoon 85.96% <ø> (ø)
kyverno 82.27% <ø> (ø)
lighttpd 83.64% <ø> (ø)
linkerd 85.22% <ø> (+1.13%) ⬆️
linux_proc_extras 96.22% <ø> (ø)
mapr 82.42% <ø> (ø)
mapreduce 82.08% <ø> (ø)
marathon 83.12% <ø> (ø)
mcache 93.50% <ø> (ø)
mesos_master 89.81% <ø> (ø)
nagios 89.01% <ø> (ø)
network 93.64% <ø> (+1.08%) ⬆️
nfsstat 95.20% <ø> (ø)
nginx 95.07% <ø> (+0.53%) ⬆️
nginx_ingress_controller 98.36% <ø> (ø)
nvidia_triton 88.52% <ø> (ø)
openldap 96.33% <ø> (ø)
openmetrics 98.08% <ø> (ø)
openstack 55.19% <ø> (ø)
openstack_controller 94.44% <ø> (?)
pgbouncer 91.35% <ø> (ø)
php_fpm 90.53% <ø> (+0.82%) ⬆️
postfix 88.10% <ø> (ø)
powerdns_recursor 96.65% <ø> (ø)
presto 59.09% <ø> (ø)
prometheus 94.17% <ø> (ø)
proxysql 98.97% <ø> (ø)
pulsar 100.00% <ø> (ø)
rabbitmq 95.37% <ø> (ø)
ray 96.45% <ø> (ø)
rethinkdb 97.93% <ø> (ø)
riak 99.21% <ø> (ø)
riakcs 87.71% <ø> (ø)
silk 93.82% <ø> (ø)
singlestore 90.81% <ø> (ø)
snowflake 96.27% <ø> (ø)
solr 56.25% <ø> (ø)
spark 94.14% <ø> (ø)
squid 100.00% <ø> (ø)
ssh_check 92.54% <ø> (+1.31%) ⬆️
statsd 87.36% <ø> (ø)
strimzi 89.78% <ø> (ø)
supervisord 89.78% <ø> (ø)
system_core 92.66% <ø> (ø)
system_swap 98.30% <ø> (ø)
tcp_check 91.72% <ø> (+1.32%) ⬆️
teamcity 88.10% <ø> (+3.28%) ⬆️
teleport 99.61% <ø> (ø)
temporal 100.00% <ø> (ø)
teradata 94.05% <ø> (ø)
tibco_ems 91.98% <ø> (ø)
tls 92.02% <ø> (+0.86%) ⬆️
tokumx 57.52% <ø> (ø)
tomcat 60.41% <ø> (?)
torchserve 97.32% <ø> (ø)
traefik_mesh 76.75% <ø> (ø)
traffic_server 96.13% <ø> (ø)
twemproxy 79.56% <ø> (ø)
twistlock 80.47% <ø> (ø)
varnish 84.39% <ø> (+0.26%) ⬆️
vllm 93.10% <ø> (ø)
voltdb 96.85% <ø> (ø)
weaviate 76.27% <ø> (ø)
win32_event_log 82.67% <ø> (+1.11%) ⬆️
wmi_check 97.50% <ø> (ø)
yarn 89.52% <ø> (ø)

Flags with carried forward coverage won't be shown.

github-actions bot commented Aug 9, 2024

The changelog type "changed" or "removed" was used in this Pull Request, so the next release will bump the major version. Please make sure this is a breaking change, or use the "fixed" or "added" type instead.

@lu-zhengda (Contributor)

It would be good if we could split this change into two PRs: one for datadog_checks_base and one for postgres. That way we can release datadog_checks_base separately.

@natashadada natashadada marked this pull request as draft August 12, 2024 13:35
@natashadada natashadada marked this pull request as ready for review August 12, 2024 14:29
@natashadada natashadada changed the title Propagate agent host tags Add get agent host tags util function Aug 12, 2024
datadog_checks_base/datadog_checks/base/utils/db/utils.py (outdated)
if isinstance(value, list):
    result.extend(value)
else:
    logger.warning("Unable to use tags for key %s since %s is not a list", key, value)
Contributor

This log and the other warning log go directly to the customer logs. As a customer, I would have no idea how to interpret this message, which seems aimed at developers. Can we rework it to be more customer-facing?

    tags_dict = json.loads(host_tags) or {}
except Exception as e:
    logger.warning("Failed to parse tags: %s", host_tags)
    return result
Contributor

Sorry, one more comment/question here. Is failing silently (with only a log) the expected behavior we want here? I'm trying to think about how this will be used so it's hard to say for certain, but I can see this causing some nasty bugs.

Since this is a base package, the contract should be very clear. Will we ever want to handle this error directly in the integrations?

Contributor Author

Yeah, this is a great point. I was going to chat with Zhengda about this tomorrow. I'm not even sure we would hit this case, and the experience right now doesn't make sense: would the user have to tell us they see this log, and then we'd have to debug it? Is it something we could alert on? I'll talk to Zhengda and Casey tomorrow.
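
One way to make the contract raised in this thread explicit is to raise instead of logging and returning an empty list, so that each integration decides how to react. The sketch below is only an illustration of that option: `HostTagsError` and `parse_host_tags_strict` are hypothetical names, the payload shape is assumed, and nothing here is claimed to be what the PR finally shipped.

```python
import json


class HostTagsError(ValueError):
    """Hypothetical error type signalling that host tags could not be used."""


def parse_host_tags_strict(host_tags_json):
    """Hypothetical strict variant: return a flat tag list or raise HostTagsError."""
    try:
        tags_dict = json.loads(host_tags_json) or {}
    except (TypeError, ValueError) as e:
        # Surface the failure to the caller instead of hiding it behind a log line.
        raise HostTagsError("Host tags are not valid JSON") from e

    if not isinstance(tags_dict, dict):
        raise HostTagsError("Host tags must be a JSON object of tag lists")

    result = []
    for key, value in tags_dict.items():
        if not isinstance(value, list):
            raise HostTagsError("Tags for key {!r} are not a list".format(key))
        result.extend(value)
    return result


if __name__ == "__main__":
    # A caller (for example, an integration) chooses its own fallback.
    try:
        tags = parse_host_tags_strict("not json")
    except HostTagsError as err:
        tags = []
        print("Continuing without agent host tags:", err)
    print(tags)
```

The trade-off is that every caller must handle the exception, but a malformed payload can no longer silently produce metrics without host tags.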

@natashadada natashadada requested a review from lu-zhengda August 14, 2024 18:34
@natashadada natashadada merged commit 7d254ec into master Aug 16, 2024
368 of 374 checks passed
@natashadada natashadada deleted the natasha.dada/combine-tags-4 branch August 16, 2024 17:56
4 participants