# Move to config file for configuration #19
Comments
I get your concern and I think this is a good approach, but I have a few concerns:
As an example, we are currently sharing one cluster across different teams, where this could be a problem in terms of governance. That said, I like the idea of having certain parts configured in the core, such as the default subscription and default metrics.

### Extended proposal

Personally I think your idea is perfect, but it needs to be taken a step further so that you can deploy the adapter with its configuration, where that configuration is a simple pointer to a more detailed metrics definition for a specific namespace or "metric group". This might sound like a bit of overkill, but it provides more flexibility for the consumer and less overhead for the admins.

### HPA Configuration

Simplified down to the name of the metric on which you'd like to scale and the metrics group it belongs to:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: consumer-scaler
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metricName: queuemessages
        metricSelector:
          matchLabels:
            metricName: order-queue-size
            metricsGroup: order-service-metrics
        targetValue: 30
```

### Adapter Configuration

The adapter then gets the correct configuration based on the following config:

```yaml
# List of all metric groups which are defined in the cluster
# Allows you to configure one or more metric groups from the same namespace but with a different name
metricGroups:
  - name: order-service-metrics
    namespace: order-service
    config-map: order-metrics-declaration
  - name: order-service-custom-metrics
    namespace: order-service
    config-map: custom-order-metrics-declaration
  - name: warehouse-monitor-metrics
    namespace: warehouse-service
    config-map: warehouse-monitor-metrics
```

Caveat - this way of linking to a config-map is probably not ideal, but it's just an indication of the approach.

### Metric Group Configuration

The configuration map in the mentioned namespace is then the spec that you've proposed:

```yaml
# list all external metrics; values are obtained from https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-supported-metrics
external:
  - metric: queuemessages # this is the name that will be referenced by the HPA
    metricName: Messages # Azure name - case sensitive
    resourceGroup: sb-external-example
    resourceName: sb-external-ns
    resourceProviderNamespace: Microsoft.Servicebus # this can contain slashes (/)
    resourceType: namespaces
    aggregation: Total
    filter: EntityName eq 'externalq' # any valid OData filter
    subscriptionID: 12345678-1234-1234-1234-12345678901234 # optional (override global)
# list all custom metrics
custom:
  - metric: rps # this is the name that will be referenced by the HPA
    metricName: performanceCounters/requestsPerSecond # Azure name, which contains slashes
    applicationId: 12345678-1234-1234-1234-123456789012 # optional (override global)
  # use with AI queries
  - metric: requests # this is the name that will be referenced by the HPA
    # AI query as defined at https://docs.loganalytics.io/docs/Language-Reference/Query-statements
    query: requests | where timestamp > ago(30d) and client_City == "Redmond" | summarize clients = dcount(client_IP) by tod_UTC=bin(timestamp % 1d, 1h), resultCode | extend local_hour = (tod_UTC - 8h) % 24h
    applicationId: 12345678-1234-1234-1234-123456789012 # optional (override global)
```

Note - I think this can be improved a bit in terms of readability, but it looks good. The only thing I would do is provide a bit more structure in what is related to what, a bit similar to how I currently do it for Promitor.

### Security

When we go this route, it also means we can take security a step further and configure the authentication in the "metrics group" configuration, so that multiple service principals can be used for authentication.

### Visual Flow

---
Was thinking about my proposal again and I'm not really sure if it's even possible to get those referenced ConfigMaps, do you know? Instead of using the name, it would probably be better to use a label selector - either with specified labels or built-in ones that we enforce (which is less nice).

---
It would be possible to pull the config maps for those referenced files via the Kubernetes API. The challenge is that it would add overhead in terms of configuring RBAC on those resources, plus the extra overhead of configuration. I do like the added benefits (one bad actor can't fail the entire adapter, and there's management separation of configs). Another thing to consider is how to restrict access to metrics across the different namespaces. There are two scenarios to consider:
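As an illustration of the RBAC overhead mentioned above, here is a minimal sketch of the per-namespace Role and RoleBinding the adapter would need in order to read referenced ConfigMaps. The service account name and namespaces here are assumptions for the example, not part of the proposal:

```yaml
# Assumption: the adapter runs as service account "metrics-adapter" in namespace "custom-metrics".
# Each team namespace would need a Role like this, plus a RoleBinding to the adapter's service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-metric-configmaps
  namespace: order-service
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-adapter-read-configmaps
  namespace: order-service
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: read-metric-configmaps
subjects:
  - kind: ServiceAccount
    name: metrics-adapter
    namespace: custom-metrics
```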
---
An idea is that we could handle this via the "Adapter Configuration". If my HPA mentioned these selectors:

```yaml
metricSelector:
  matchLabels:
    metricName: order-queue-size
    metricsGroup: order-service-metrics
```

you can see to which namespace it's assigned and check if it matches:

```yaml
metricGroups:
  - name: order-service-metrics
    namespace: order-service
    config-map: order-metrics-declaration
  - name: order-service-custom-metrics
    namespace: order-service
    config-map: custom-order-metrics-declaration
  - name: warehouse-monitor-metrics
    namespace: warehouse-service
    config-map: warehouse-monitor-metrics
```

---
That would partially solve the issue. If we look at two teams working in the same cluster in different namespaces, and even across different Azure subscriptions, we want to make sure team 1 can't access metrics from team 2, which would still be possible if we only had the check above.

### External Metrics

I've been thinking about how to do this, especially in the case of external metrics. In a metrics system where you can tag metrics, you would be able to use the namespace value and ensure that a request from an HPA within a namespace would only get values tagged with that namespace. In the case of Azure external metrics coming from Azure Monitor, we don't have the ability to tag the metric, but we can tag the resource in Azure. If we tag the resource, then the adapter can use the namespace and check that the resource is assigned to the "namespace" before returning the value. Since each team could effectively own each resource (team 1 doesn't have access to team 2's subscription), this resolves the access issue. Tags can easily be applied to a resource:

```bash
jsonrtag=$(az resource show -g examplegroup -n examplevnet --resource-type "Microsoft.Network/virtualNetworks" --query tags)
rt=$(echo $jsonrtag | tr -d '"{},' | sed 's/: /=/g')
az resource tag --tags $rt namespace=team2 -g examplegroup -n examplevnet --resource-type "Microsoft.Network/virtualNetworks"
```

### Custom Metrics

In Application Insights, we can have properties for each metric that can be filtered on, so we can add Kubernetes metadata to the properties and filter based on namespace. The challenge is that, looking at the existing AI SDKs, only the C# library currently has an add-on to add the metadata. For now, we can instruct users to add the metadata via property initializers. It would be a required step to use the adapter.

### Root scoped metrics

Enabling these checks will help a customer set up a secure environment by default. The use of root-scoped metrics and relaxing the default policy on the HPA (which will be scoped to namespaced metrics) will allow for accessing the metrics without namespaces if needed/desired. This will require the cluster admin to make an explicit decision.

---
### Metric Groups

Based on the conversation so far, I think the configuration file will have the following format:
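A sketch of that format, reconstructed from the samples earlier in the thread; the field names follow those samples and the whole block should be read as indicative rather than final:

```yaml
# Indicative only - reconstructed from the earlier examples in this thread.
# Global defaults, overridable per metric:
subscriptionID: 12345678-1234-1234-1234-12345678901234

# Quick, inline configuration for small teams:
external:
  - metric: queuemessages
    # ... metric properties as defined above
custom:
  - metric: rps
    # ... metric properties as defined above

# Optional metric groups for teams that need the flexibility:
metricGroups:
  - name: order-service-metrics
    namespace: order-service
    config-map: order-metrics-declaration
```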
This will allow for quick configuration for small teams, yet give the option of metric groups for teams that need the flexibility. We will not dynamically load the config map through an API, but will instead rely on the deployment being configured with the additional config map (see the fragment below). The adapter will periodically look for updates and re-read the config map; if the new configmap is invalid, it will rely on the older configuration and will not fail the entire adapter.
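A minimal sketch of what "configured with the additional config map" could look like on the deployment side; the resource names, image, and mount path are all assumptions for illustration:

```yaml
# Illustrative only - names, image, and mount path are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: azure-metrics-adapter
spec:
  selector:
    matchLabels:
      app: azure-metrics-adapter
  template:
    metadata:
      labels:
        app: azure-metrics-adapter
    spec:
      containers:
        - name: adapter
          image: azure-metrics-adapter:latest # placeholder image
          volumeMounts:
            - name: adapter-config
              mountPath: /etc/adapter # adapter re-reads this file periodically
      volumes:
        - name: adapter-config
          configMap:
            name: adapter-config # the config file discussed above
```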
---

I like the last suggestion for the configuration file! However, I'm a bit concerned about tagging resources and using the custom properties in Application Insights for filtering metrics, as this requires (significant) setup just to get metrics. Is there a reason why we can't restrict metric groups for all namespaces except the mentioned one? We know where the HPA is running and to what namespace the metrics group is restricted, so in theory we know if they are allowed to scrape it, no? If this is about the adapter scraping subscriptions for a group to which it is not allowed, we might consider moving the authentication information to the namespace as well and introduce an authentication reference per metric group. Sample:

```yaml
# optional
metricGroups:
  - name: order-service-metrics
    namespace: order-service
    config-map: order-metrics-declaration # contains definitions for external/custom metrics as defined
    authentication: order-metrics-auth-secret # contains authentication secrets for accessing Azure Monitor
# optional
external:
  - metric: queuemessages # this is the name that will be referenced by the HPA
    # The metric properties described above
# optional
custom:
  - metric: rps # this is the name that will be referenced by the HPA
    # metric properties described above
```
---

@tomkerkhove I think you're right about the additional overhead for external metrics. When it comes to App Insights, I think having the extra metadata will become important in the long run anyway. If you use a single App Insights instance for more than one app, you will need to add that metadata to be able to filter by application. In addition, I could see the need for getting detailed metrics across a given application and filtering by pod to find a misbehaving instance. But that is something we can easily handle through the configuration. Adding the authentication to the metric groups will give that added check I was concerned about. Unless anyone else has concerns or input, we are good on the final design and can iterate on it as it's developed. Thanks for all the input!
---

Currently, all of the metadata required to pull the metrics from Azure is hosted on the HPA, for example on the external metrics HPA:
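Such an HPA would look roughly like this, with every piece of Azure metadata carried as selector labels; the values reuse the Service Bus example from this thread and are illustrative, not a definitive spec:

```yaml
# Illustrative reconstruction - all Azure metadata travels in matchLabels.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: consumer-scaler
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: consumer
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: External
      external:
        metricName: queuemessages
        metricSelector:
          matchLabels:
            metricName: Messages # correct casing must be carried here
            resourceGroup: sb-external-example
            resourceName: sb-external-ns
            resourceProviderNamespace: Microsoft.Servicebus
            resourceType: namespaces
            aggregation: Total
        targetValue: 30
```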
This creates overhead for the end user, especially if they were to try to map OData queries to label selectors, as I previously suggested.
Another interesting result of this conversion shows up in metric names. With custom metrics, the end user must convert the Application Insights metric name from `performanceCounters/requestsPerSecond` to `performanceCounters-requestsPerSecond`, replacing the `/` with a `-`. With external metrics, the "metric name" is not used at all and is instead pulled from the selector. This is because the metric name in the URL is lowercased, and Azure Monitor is case sensitive on metric names; for example, the Service Bus metric `Messages` cannot be `messages`, which is what happens when the metric name is passed via URL from the HPA.

### Proposal
To create a better experience for the developer, I propose we use a config file, loaded via configmap, to hold this metadata. The Prometheus Adapter uses the idea of a config, though I believe we don't need it to be as complex (yet). This would allow a few scenarios:
To help even further, it would be possible to provide tooling that would auto-generate this config file based on the service principal that has access to the Azure resources.
### Example config
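This is the external/custom metrics spec quoted in the first comment above; wrapped in a configmap for delivery, it looks along these lines, where the configmap name and file key are assumptions:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config # name is an assumption
data:
  # "config.yaml" as the file key is also an assumption
  config.yaml: |
    external:
      - metric: queuemessages # name referenced by the HPA
        metricName: Messages # Azure name - case sensitive
        resourceGroup: sb-external-example
        resourceName: sb-external-ns
        resourceProviderNamespace: Microsoft.Servicebus
        resourceType: namespaces
        aggregation: Total
        filter: EntityName eq 'externalq'
    custom:
      - metric: rps # name referenced by the HPA
        metricName: performanceCounters/requestsPerSecond
```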
### Feedback
Please provide any feedback you might have on the above proposal. Thanks!