configure_azure_monitor() takes abnormally long time #34902
Labels: Client, customer-reported, feature-request, Monitor - Exporter, needs-team-attention, question, Service Attention
Describe the bug
With the default configuration (that is, no configuration at all), configure_azure_monitor() takes an abnormally long time to execute: ~10 seconds.
To Reproduce
long.py:
APPLICATIONINSIGHTS_CONNECTION_STRING="..." python long.py
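The contents of long.py are not shown above. As a rough sketch only, a script that measures the call could look like the following; time_call and measure_configure are hypothetical helper names, and the sketch assumes the azure-monitor-opentelemetry package is installed and the connection string is set in the environment.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def measure_configure():
    """Time configure_azure_monitor() with default settings.

    Assumption: azure-monitor-opentelemetry is installed and
    APPLICATIONINSIGHTS_CONNECTION_STRING is set in the environment.
    """
    # Imported here so that only the call itself is timed, not the import.
    from azure.monitor.opentelemetry import configure_azure_monitor

    _, elapsed = time_call(configure_azure_monitor)
    print(f"configure_azure_monitor() took {elapsed:.2f} s")
    return elapsed
```

Calling measure_configure() at the end of the script and running it with the command above prints the elapsed time for the call.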
Expected behavior
Reasonable time to configure (< 1 s).
Additional context
After running it in a debugger, I found two main places in the code that contribute to the delay. Both check whether the process is running in an Azure VM.
1. Azure VM resource detector
Location: https://github.com/open-telemetry/opentelemetry-python-contrib/blob/37aba928d45713842941c7efc992726a79ea7d8a/resource/opentelemetry-resource-detector-azure/src/opentelemetry/resource/detector/azure/vm.py#L77
The way code gets there:
Then in https://github.com/open-telemetry/opentelemetry-python-contrib/blob/main/resource/opentelemetry-resource-detector-azure/src/opentelemetry/resource/detector/azure/vm.py
2. Statsbeat metrics
Location:
azure-sdk-for-python/sdk/monitor/azure-monitor-opentelemetry-exporter/azure/monitor/opentelemetry/exporter/statsbeat/_statsbeat_metrics.py
Lines 212 to 215 in a9b8513
Call stack:
In both cases the delay comes from requests to this endpoint:
http://169.254.169.254/metadata/instance/compute
although with different API versions. The first location uses a request timeout of 4 seconds and the second uses 5 seconds, which together account for almost the entire startup delay.
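To see how a single metadata request behaves on a given machine, a probe along these lines may help. It is a sketch, not the code either library uses: probe is a hypothetical helper, the api-version query parameter is left out because the exact versions are not listed here, and the Metadata: true header is the one the Azure Instance Metadata Service expects.

```python
import time
import urllib.error
import urllib.request

# Endpoint quoted in the issue; the api-version parameter is omitted here.
IMDS_URL = "http://169.254.169.254/metadata/instance/compute"


def probe(url, timeout):
    """Send one GET and return (reachable, elapsed_seconds)."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            reachable = True
    except urllib.error.HTTPError:
        # An HTTP error status still means something answered the request.
        reachable = True
    except OSError:
        # Covers URLError, connection refused, and socket timeouts.
        reachable = False
    return reachable, time.perf_counter() - start
```

Outside Azure, probe(IMDS_URL, 4) followed by probe(IMDS_URL, 5) can block for up to about 9 seconds in total, which is consistent with the ~10 second startup time reported above.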
Workarounds
1. Set the OTEL_EXPERIMENTAL_RESOURCE_DETECTORS=otel environment variable. If it is not set, the library sets a default value that includes App Service and Azure VM.
2. Set APPLICATIONINSIGHTS_STATSBEAT_DISABLED_ALL=TRUE
The above tweaks bring the configuration time down to ~0.8 s (and with OTEL_PYTHON_DISABLED_INSTRUMENTATIONS set to azure_sdk,django,fastapi,flask,psycopg2,requests,urllib,urllib3 it completes in under 30 ms).
It took me hours to find the above options for fixing the startup time without touching the code. I think we need to make the library friendlier to running in non-Azure environments.
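For setups where the environment cannot easily be changed from outside the process, the two workaround variables could also be set from Python before the library is configured. This is a sketch under assumptions: apply_workarounds is a hypothetical helper, and it assumes both variables are read no earlier than the call to configure_azure_monitor(), so it is safest to run it before importing the library.

```python
import os

# Values taken from the workarounds listed above.
WORKAROUND_ENV = {
    "OTEL_EXPERIMENTAL_RESOURCE_DETECTORS": "otel",
    "APPLICATIONINSIGHTS_STATSBEAT_DISABLED_ALL": "TRUE",
}


def apply_workarounds(env=os.environ):
    """Set the workaround variables unless the caller already set them."""
    for name, value in WORKAROUND_ENV.items():
        # setdefault keeps any value that is already present.
        env.setdefault(name, value)
    return env
```

Using setdefault means an explicit value supplied by the deployment still wins, so the helper only changes behavior where nothing was configured.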