azure_monitor support for VM scale sets #5819

johncrim · 2019-05-08T19:25:00Z

Currently, the azure_monitor output plugin doesn't correctly resolve resource IDs for VMs that are part of Virtual Machine Scale Sets. It only works automatically for standalone, individually configured Virtual Machines. By default, VMs in scale sets produce a stream of 404 errors in the telegraf log, and the cause is difficult to diagnose.

Proposal:

The resourceIDTemplate unnecessarily constrains the resource ID to non-scaleset VMs:

resourceIDTemplate    = "/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachines/%s"

Fixing this template is easy enough - the harder part is adding logic to determine whether the VM is running singly or within a scaleset resource.

Scale set VMs do have the Instance Metadata service running, and managed service identity works the same way. The only change this plugin needs is to determine the correct resource ID, using a template like:

vmssResourceIDTemplate    = "/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachineScaleSets/%s"
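
As a rough illustration of the selection logic described above, here is a minimal Go sketch. It assumes the Instance Metadata service reports a scale set name for scale set members and leaves it empty on standalone VMs. The computeMetadata struct and the resourceID function are hypothetical names used only for illustration, not the plugin's actual code.

```go
package main

import "fmt"

const (
	// Template quoted in this issue for standalone VMs.
	resourceIDTemplate = "/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachines/%s"
	// Template proposed in this issue for scale set members.
	vmssResourceIDTemplate = "/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachineScaleSets/%s"
)

// computeMetadata is a hypothetical holder for the metadata fields
// needed to build a resource ID.
type computeMetadata struct {
	SubscriptionID    string
	ResourceGroupName string
	Name              string
	VMScaleSetName    string // assumption: empty on standalone VMs
}

// resourceID picks the scale set template when a scale set name is
// present and falls back to the single-VM template otherwise.
func resourceID(m computeMetadata) string {
	if m.VMScaleSetName != "" {
		return fmt.Sprintf(vmssResourceIDTemplate, m.SubscriptionID, m.ResourceGroupName, m.VMScaleSetName)
	}
	return fmt.Sprintf(resourceIDTemplate, m.SubscriptionID, m.ResourceGroupName, m.Name)
}

func main() {
	// Placeholder values, not taken from a real subscription.
	member := computeMetadata{
		SubscriptionID:    "sub",
		ResourceGroupName: "rg",
		Name:              "vmss_0",
		VMScaleSetName:    "vmss",
	}
	fmt.Println(resourceID(member))
}
```

With this approach the metrics are published against the scale set resource rather than the individual instance, which matches the manual workaround described under "Current behavior".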

Current behavior:

Currently, if this setup is performed on a VM in a scaleset:

apt install telegraf -y
telegraf --input-filter cpu:mem:diskio:net --output-filter azure_monitor config > /etc/telegraf/telegraf.conf
systemctl restart telegraf

The telegraf service log then fills with 404 errors, without showing the URL that was requested.

If resource_id is manually set to the VM scale set resource ID in telegraf.conf, the telegraf metrics are sent as expected.

Desired behavior:

Installing and configuring the azure_monitor output plugin as specified (and as documented in the Microsoft and Influx docs) just works.

Use case:

Many cloud-native use cases on Azure rely on VM scale sets (e.g. Kubernetes or Service Fabric). Adding this support makes telegraf usable in Azure for VMs that aren't individually configured.


anildesai61 · 2023-11-14T18:35:58Z

Hello @johncrim

I have an Ubuntu 18.04 LTS VMSS on Azure. Below are the steps I followed:

apt install telegraf -y
telegraf --input-filter cpu:mem:diskio:net --output-filter azure_monitor config > /etc/telegraf/telegraf.conf
systemctl restart telegraf

I then added this line to /etc/telegraf/telegraf.conf:
resource_id = "/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachineScaleSets/%s"

However, the telegraf metrics still don't appear under the Metrics option for the VMSS. Could you please help?

telegraf status

telegraf.service - Telegraf
Loaded: loaded (/lib/systemd/system/telegraf.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2023-11-14 18:01:09 UTC; 2min 6s ago
Docs: https://github.com/influxdata/telegraf
Main PID: 3505 (telegraf)
Tasks: 7 (limit: 4915)
CGroup: /system.slice/telegraf.service
└─3505 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d

Nov 14 18:01:09 waf-teleg000000 telegraf[3505]: 2023-11-14T18:01:09Z I! Tags enabled: host=waf-teleg000000
Nov 14 18:01:09 waf-teleg000000 systemd[1]: Started Telegraf.
Nov 14 18:01:09 waf-teleg000000 telegraf[3505]: 2023-11-14T18:01:09Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"waf-teleg000000", Flush Interval:10s
Nov 14 18:02:09 waf-teleg000000 telegraf[3505]: 2023-11-14T18:02:09Z E! [agent] Error writing to outputs.azure_monitor: unable to fetch authentication credentials: azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://CentralIndia.monitoring.azure.com/subscriptions/xxxx-xx-xx-xx-xxx-xxx-xxx-xxx-xxxx/resourceGroups/TELEGRAPH/providers/Microsoft.Compute/virtualMachineScaleSets/waf-telegraph1/metrics: StatusCode=400 -- Original Error: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"} Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fmonitoring.azure.com%2F

Telegraf config file.

#Send aggregate metrics to Azure Monitor
[[outputs.azure_monitor]]
#Timeout for HTTP writes.
timeout = "20s"

#Set the namespace prefix, defaults to "Telegraf/".
namespace_prefix = "Telegraf/Apache"

#Azure Monitor doesn't have a string value type, so convert string
#fields to dimensions (a.k.a. tags) if enabled. Azure Monitor allows
#a maximum of 10 dimensions so Telegraf will only send the first 10
#alphanumeric dimensions.
strings_as_dimensions = false

#Both region and resource_id must be set or be available via the
#Instance Metadata service on Azure Virtual Machines.
#Azure Region to publish metrics against
region = "centralindia"

#The Azure Resource ID against which metric will be logged, e.g.
resource_id = "/subscriptions/xxx-xx-x-xxxxx-xxx-xx/resourceGroups/TELEGRAPH/providers/Microsoft.Compute/virtualMachineScaleSets/waf-telegraph1"

Please share an example telegraf config file. I want to scale up the VMSS based on Apache request metrics collected by telegraf.
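
The log above shows the token request to the Instance Metadata endpoint failing with status 400 and the description "Identity not found", so the write fails before the resource ID is ever used. The following Go sketch decodes that error body and turns it into a readable hint. The explainTokenFailure helper is hypothetical, and the suggestion that no managed identity is assigned to the scale set is an assumption about the likely cause, not something the log confirms.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imdsError mirrors the JSON error body shown in the log above.
type imdsError struct {
	Code        string `json:"error"`
	Description string `json:"error_description"`
}

// explainTokenFailure is a hypothetical helper that maps a failed
// token response to a short diagnostic message.
func explainTokenFailure(status int, body []byte) string {
	var e imdsError
	if err := json.Unmarshal(body, &e); err != nil {
		return fmt.Sprintf("status %d: unparseable response body", status)
	}
	// Assumption: this combination usually means no managed identity
	// is visible to the instance requesting the token.
	if status == 400 && e.Description == "Identity not found" {
		return "no managed identity found; check that one is assigned to the scale set"
	}
	return fmt.Sprintf("status %d: %s: %s", status, e.Code, e.Description)
}

func main() {
	// Error body copied from the log above.
	body := []byte(`{"error":"invalid_request","error_description":"Identity not found"}`)
	fmt.Println(explainTokenFailure(400, body))
}
```

If that assumption holds, the resource_id setting in the config is not the problem; the identity assignment on the scale set would be the first thing to check.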

danielnelson self-assigned this May 8, 2019

danielnelson added the area/azure and bug labels May 8, 2019

danielnelson mentioned this issue May 8, 2019

Fix resource id in virtual machine scale sets with azure_monitor output #5821

Merged

3 tasks

danielnelson closed this as completed in #5821 May 20, 2019

danielnelson added this to the 1.11.0 milestone May 20, 2019
