From 8b4578bf3ad32f233456d6bb110ff36ff3423b95 Mon Sep 17 00:00:00 2001
From: Russ Cam
Date: Thu, 11 Mar 2021 22:45:19 +1000
Subject: [PATCH] [META 388] Proposal for collecting Azure App Service cloud metadata (#365)

This commit updates the cloud metadata spec to propose
collecting Azure App Service metadata.

Azure App Services are a very popular PaaS offering on Azure. In contrast
to Azure VMs, App Services do not have access to the internal metadata
endpoint in order to retrieve metadata about the app instance. Instead,
much of the metadata that is relevant to the purposes of APM is available
in environment variables.

The environment variables of interest are documented here.
---
 specs/agents/metadata.md                      | 106 ++++++++++++++++++
 .../azure_app_service_metadata.feature        |  71 ++++++++++++
 2 files changed, 177 insertions(+)
 create mode 100644 tests/agents/gherkin-specs/azure_app_service_metadata.feature

diff --git a/specs/agents/metadata.md b/specs/agents/metadata.md
index 5748e5b6..f45ef6e1 100644
--- a/specs/agents/metadata.md
+++ b/specs/agents/metadata.md
@@ -99,6 +99,112 @@ metadata is available.
 A sample implementation of this metadata collection is available in
 [the Python agent](https://github.com/elastic/apm-agent-python/blob/master/elasticapm/utils/cloud.py).
 
+#### AWS metadata
+
+[Metadata about an EC2 instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html) can be retrieved from the internal metadata endpoint, `http://169.254.169.254`.
+
+As an example with curl, first, an API token must be created
+
+```sh
+TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300"`
+```
+
+Then, metadata can be retrieved, passing the API token
+
+```sh
+curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data
+```
+
+From the returned metadata, the following fields are useful
+
+| Cloud metadata field | AWS Metadata field  |
+| -------------------- | ------------------- |
+| `account.id`         | `accountId`         |
+| `instance.id`        | `instanceId`        |
+| `availability_zone`  | `availabilityZone`  |
+| `machine.type`       | `instanceType`      |
+| `provider`           | aws                 |
+| `region`             | `region`            |
+
+#### GCP metadata
+
+Metadata about a GCP machine instance can be retrieved from the
+metadata service, `http://metadata.google.internal`.
+
+An example with curl
+
+```sh
+curl -X GET "http://metadata.google.internal/computeMetadata/v1/?recursive=true" -H "Metadata-Flavor: Google"
+```
+
+From the returned metadata, the following fields are useful
+
+| Cloud metadata field | GCP Metadata field  |
+| -------------------- | ------------------- |
+| `instance.id`        | `instance.id`       |
+| `instance.name`      | `instance.name`     |
+| `project.id`         | `project.numericProjectId` as a string |
+| `project.name`       | `project.projectId` |
+| `availability_zone`  | last part of `instance.zone`, split by `/` |
+| `machine.type`       | last part of `instance.machineType`, split by `/` |
+| `provider`           | gcp                 |
+| `region`             | last part of `instance.zone`, split by `/`, with the trailing `-` separated zone suffix removed |
+
+#### Azure metadata
+
+##### Azure VMs
+
+Metadata about an Azure VM can be retrieved from the internal metadata
+endpoint, `http://169.254.169.254`.
+
+An example with curl
+
+```sh
+curl -X GET "http://169.254.169.254/metadata/instance/compute?api-version=2019-08-15" -H "Metadata: true"
+```
+
+From the returned metadata, the following fields are useful
+
+| Cloud metadata field | Azure Metadata field |
+| -------------------- | -------------------- |
+| `account.id`         | `subscriptionId`     |
+| `instance.id`        | `vmId`               |
+| `instance.name`      | `name`               |
+| `project.name`       | `resourceGroupName`  |
+| `availability_zone`  | `zone`               |
+| `machine.type`       | `vmSize`             |
+| `provider`           | azure                |
+| `region`             | `location`           |
+
+##### Azure App Services _(Optional)_
+
+Azure App Services are a PaaS offering within Azure which does not
+have access to the internal metadata endpoint. Metadata about
+an App Service can however be retrieved from environment variables
+
+
+| Cloud metadata field | Environment variable |
+| -------------------- | ------------------- |
+| `account.id`         | first part of `WEBSITE_OWNER_NAME`, split by `+` |
+| `instance.id`        | `WEBSITE_INSTANCE_ID` |
+| `instance.name`      | `WEBSITE_SITE_NAME` |
+| `project.name`       | `WEBSITE_RESOURCE_GROUP` |
+| `provider`           | azure               |
+| `region`             | last part of `WEBSITE_OWNER_NAME`, split by `-`, trim end `"webspace"` and anything following |
+
+The environment variable `WEBSITE_OWNER_NAME` has the form
+
+```
+{subscription id}+{app service plan resource group}-{region}webspace{.*}
+```
+
+an example of which is `f5940f10-2e30-3e4d-a259-63451ba6dae4+elastic-apm-AustraliaEastwebspace`
+
+Cloud metadata for Azure App Services is optional; it is up
+to each agent to determine whether it is useful to implement
+for their language ecosystem. See [azure_app_service_metadata specs](../../tests/agents/gherkin-specs/azure_app_service_metadata.feature)
+for scenarios and expected outcomes.
+
 ### Global labels
 
 Events sent by the agents can have labels associated, which may be useful for custom aggregations, or document-level access control. It is possible to add "global labels" to the metadata, which are labels that will be applied to all events sent by an agent. These are only understood by APM Server 7.2 or greater.
diff --git a/tests/agents/gherkin-specs/azure_app_service_metadata.feature b/tests/agents/gherkin-specs/azure_app_service_metadata.feature
new file mode 100644
index 00000000..3149e871
--- /dev/null
+++ b/tests/agents/gherkin-specs/azure_app_service_metadata.feature
@@ -0,0 +1,71 @@
+Feature: Extracting Metadata for Azure App Service
+
+  Background:
+    Given an instrumented application is configured to collect cloud provider metadata for azure
+
+  Scenario Outline: Azure App Service with all environment variables present in expected format
+    Given the following environment variables are present
+      | name                   | value                |
+      | WEBSITE_OWNER_NAME     | <WEBSITE_OWNER_NAME> |
+      | WEBSITE_RESOURCE_GROUP | resource_group       |
+      | WEBSITE_SITE_NAME      | site_name            |
+      | WEBSITE_INSTANCE_ID    | instance_id          |
+    When cloud metadata is collected
+    Then cloud metadata is not null
+    And cloud metadata 'account.id' is 'f5940f10-2e30-3e4d-a259-63451ba6dae4'
+    And cloud metadata 'provider' is 'azure'
+    And cloud metadata 'instance.id' is 'instance_id'
+    And cloud metadata 'instance.name' is 'site_name'
+    And cloud metadata 'project.name' is 'resource_group'
+    And cloud metadata 'region' is 'AustraliaEast'
+    Examples:
+      | WEBSITE_OWNER_NAME |
+      | f5940f10-2e30-3e4d-a259-63451ba6dae4+elastic-apm-AustraliaEastwebspace |
+      | f5940f10-2e30-3e4d-a259-63451ba6dae4+appsvc_linux_australiaeast-AustraliaEastwebspace-Linux |
+
+  # WEBSITE_OWNER_NAME is expected to include a + character
+  Scenario: WEBSITE_OWNER_NAME environment variable not expected format
+    Given the following environment variables are present
+      | name                   | value          |
+      | WEBSITE_OWNER_NAME     | f5940f10-2e30-3e4d-a259-63451ba6dae4-elastic-apm-AustraliaEastwebspace |
+      | WEBSITE_RESOURCE_GROUP | resource_group |
+      | WEBSITE_SITE_NAME      | site_name      |
+      | WEBSITE_INSTANCE_ID    | instance_id    |
+    When cloud metadata is collected
+    Then cloud metadata is null
+
+  Scenario: Missing WEBSITE_OWNER_NAME environment variable
+    Given the following environment variables are present
+      | name                   | value          |
+      | WEBSITE_RESOURCE_GROUP | resource_group |
+      | WEBSITE_SITE_NAME      | site_name      |
+      | WEBSITE_INSTANCE_ID    | instance_id    |
+    When cloud metadata is collected
+    Then cloud metadata is null
+
+  Scenario: Missing WEBSITE_RESOURCE_GROUP environment variable
+    Given the following environment variables are present
+      | name                | value       |
+      | WEBSITE_OWNER_NAME  | f5940f10-2e30-3e4d-a259-63451ba6dae4+elastic-apm-AustraliaEastwebspace |
+      | WEBSITE_SITE_NAME   | site_name   |
+      | WEBSITE_INSTANCE_ID | instance_id |
+    When cloud metadata is collected
+    Then cloud metadata is null
+
+  Scenario: Missing WEBSITE_SITE_NAME environment variable
+    Given the following environment variables are present
+      | name                   | value          |
+      | WEBSITE_OWNER_NAME     | f5940f10-2e30-3e4d-a259-63451ba6dae4+elastic-apm-AustraliaEastwebspace |
+      | WEBSITE_RESOURCE_GROUP | resource_group |
+      | WEBSITE_INSTANCE_ID    | instance_id    |
+    When cloud metadata is collected
+    Then cloud metadata is null
+
+  Scenario: Missing WEBSITE_INSTANCE_ID environment variable
+    Given the following environment variables are present
+      | name                   | value          |
+      | WEBSITE_OWNER_NAME     | f5940f10-2e30-3e4d-a259-63451ba6dae4+elastic-apm-AustraliaEastwebspace |
+      | WEBSITE_RESOURCE_GROUP | resource_group |
+      | WEBSITE_SITE_NAME      | site_name      |
+    When cloud metadata is collected
+    Then cloud metadata is null
\ No newline at end of file
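
Note (not part of the patch): the App Service rules in this change can be sketched in Python, the language of the sample implementation the spec links to. This is a minimal sketch under stated assumptions, not the implementation of any agent. The function name `azure_app_service_metadata`, the nested dict layout, and the choice to return `None` when the `webspace` marker or the `-` before the region is missing are all assumptions made for illustration; the spec itself only requires null when a variable is missing or `WEBSITE_OWNER_NAME` lacks a `+`.

```python
import os
from typing import Mapping, Optional

# Environment variables named in the spec table for Azure App Services.
REQUIRED_VARS = (
    "WEBSITE_OWNER_NAME",
    "WEBSITE_RESOURCE_GROUP",
    "WEBSITE_SITE_NAME",
    "WEBSITE_INSTANCE_ID",
)


def azure_app_service_metadata(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Hypothetical helper: build cloud metadata from App Service env vars."""
    # Every scenario with a missing variable expects null metadata.
    if any(not env.get(name) for name in REQUIRED_VARS):
        return None

    # Expected form:
    # {subscription id}+{app service plan resource group}-{region}webspace{.*}
    subscription_id, plus, rest = env["WEBSITE_OWNER_NAME"].partition("+")
    if not plus or not subscription_id:
        return None  # no "+" separator: not the expected format

    # Assumption: locate "webspace", then take the text between the last "-"
    # before it and the marker itself as the region.
    marker = rest.rfind("webspace")
    if marker == -1:
        return None
    dash = rest.rfind("-", 0, marker)
    if dash == -1:
        return None
    region = rest[dash + 1:marker]
    if not region:
        return None

    return {
        "provider": "azure",
        "account": {"id": subscription_id},
        "instance": {
            "id": env["WEBSITE_INSTANCE_ID"],
            "name": env["WEBSITE_SITE_NAME"],
        },
        "project": {"name": env["WEBSITE_RESOURCE_GROUP"]},
        "region": region,
    }
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to exercise against the Gherkin scenarios. Searching for `webspace` first, rather than splitting on the last `-`, is what lets the `...-AustraliaEastwebspace-Linux` example resolve to `AustraliaEast`.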