Thanos v0.34.0
(Deployed through Bitnami Helm chart)
Object Storage Provider:
Azure Storage Account
What happened:
After upgrading from v0.33.0 to v0.34.0, the components that connect to Azure Storage using Workload Identity (compactor and store gateway) can no longer connect, and I get the error below. After downgrading back to v0.33.0, everything works as expected again.
When I check the Service Account attached to both components, I do see the client-id. Annotations: azure.workload.identity/client-id: xxxxxx
However, the corresponding environment variable is only set in the pods when I deploy v0.33.0 of Thanos.
On v0.34.0 the environment variable is empty, which I assume is causing the issue.
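One way to confirm which variables are missing, without needing a shell in a crashing pod, is to inspect the pod spec itself. The following is a minimal sketch, not something taken from Thanos or the chart: it assumes the usual variable names injected by Azure Workload Identity (`AZURE_CLIENT_ID`, `AZURE_TENANT_ID`, `AZURE_FEDERATED_TOKEN_FILE`, `AZURE_AUTHORITY_HOST`), and the helper name `missing_workload_identity_env` is made up for illustration.

```python
# Variable names assumed to be the ones Azure Workload Identity injects.
EXPECTED_VARS = (
    "AZURE_CLIENT_ID",
    "AZURE_TENANT_ID",
    "AZURE_FEDERATED_TOKEN_FILE",
    "AZURE_AUTHORITY_HOST",
)


def missing_workload_identity_env(pod: dict) -> dict:
    """Map each container name to the expected variables that are absent or empty.

    `pod` is a pod object as a dict, e.g. parsed from `kubectl get pod <name> -o json`.
    """
    result = {}
    for container in pod.get("spec", {}).get("containers", []):
        # Entries that use valueFrom carry no literal "value"; treat them as set.
        present = {
            entry["name"]
            for entry in container.get("env", [])
            if entry.get("value") or "valueFrom" in entry
        }
        missing = [name for name in EXPECTED_VARS if name not in present]
        if missing:
            result[container["name"]] = missing
    return result


# Hypothetical pod with an empty client ID, mirroring the symptom described above.
sample_pod = {
    "spec": {
        "containers": [
            {
                "name": "storegateway",
                "env": [{"name": "AZURE_CLIENT_ID", "value": ""}],
            }
        ]
    }
}
print(missing_workload_identity_env(sample_pod))
```

To use it against a real pod, parse the output of `kubectl get pod thanos-storegateway-0 -n monitoring -o json` with `json.loads` and pass the resulting dict to the function.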
Full logs to relevant components:
Log from the Store Gateway
❯ kl thanos-storegateway-0 -n monitoring
ts=2024-02-14T13:29:52.742610403Z caller=factory.go:53 level=info msg="loading bucket configuration"
ts=2024-02-14T13:29:53.011149871Z caller=main.go:135 level=error err="DefaultAzureCredential authentication failed
POST https://login.microsoftonline.com/80525e01-c1a1-4824-9b24-acd53a540aa8/oauth2/v2.0/token
RESPONSE 400 Bad Request
{
\"error\": \"unauthorized_client\",
\"error_description\": \"AADSTS700016: Application with identifier '####' was not found in the directory '####'. This can happen if the application has not been installed by the administrator of the tenant or consented to by any user in the tenant. You may have sent your authentication request to the wrong tenant. Trace ID: 26a769f4-c39c-4652-bfcb-ddca8bf20c00 Correlation ID: 2bebca23-6f45-4439-887a-b7857bbac661 Timestamp: 2024-02-14 13:29:52Z\",
\"error_codes\": [
700016
],
\"timestamp\": \"2024-02-14 13:29:52Z\",
\"trace_id\": \"26a769f4-c39c-4652-bfcb-ddca8bf20c00\",
\"correlation_id\": \"2bebca23-6f45-4439-887a-b7857bbac661\",
\"error_uri\": \"https://login.microsoftonline.com/error?code=700016\"
}
create AZURE client
github.com/thanos-io/objstore/client.NewBucket
/bitnami/blacksmith-sandox/thanos-0.34.0/pkg/mod/github.com/thanos-io/[email protected]/client/factory.go:90
main.runStore
/bitnami/blacksmith-sandox/thanos-0.34.0/src/github.com/thanos-io/thanos/cmd/thanos/store.go:298
main.registerStore.func1
/bitnami/blacksmith-sandox/thanos-0.34.0/src/github.com/thanos-io/thanos/cmd/thanos/store.go:237
main.main
/bitnami/blacksmith-sandox/thanos-0.34.0/src/github.com/thanos-io/thanos/cmd/thanos/main.go:133
runtime.main
/opt/bitnami/go/src/runtime/proc.go:267
runtime.goexit
/opt/bitnami/go/src/runtime/asm_amd64.s:1650
preparing store command failed
main.main
/bitnami/blacksmith-sandox/thanos-0.34.0/src/github.com/thanos-io/thanos/cmd/thanos/main.go:135
runtime.main
/opt/bitnami/go/src/runtime/proc.go:267
runtime.goexit
/opt/bitnami/go/src/runtime/asm_amd64.s:1650"
How to reproduce it (as minimally and precisely as possible):
Let me check if I can find a way to easily replicate it.
Anything else we need to know:
Not sure if this matters:
I use an object store file with the following values set:
I will close the issue. It had nothing to do with upgrading Thanos from v0.33.0 to v0.34.0; it was caused by a change in the Bitnami Helm chart that I overlooked.
For the next person who misidentifies this issue: in the upgrade of the Bitnami Thanos chart from v12.20.2 to v12.20.4, the default value of automountServiceAccountToken changed from true to false, which caused the client_id environment variable not to be set on the pods.
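To check whether a rendered workload is affected by that default, the `automountServiceAccountToken` field of the pod spec can be read directly from the manifest. This is only a sketch under stated assumptions: the helper name `automount_setting` is hypothetical, and the input is assumed to be a Pod, StatefulSet, or Deployment object already parsed into a dict (for example from `kubectl get ... -o json`).

```python
from typing import Optional


def automount_setting(obj: dict) -> Optional[bool]:
    """Return the pod-level automountServiceAccountToken value, or None if unset.

    Accepts either a Pod (spec.*) or a workload that embeds a pod
    template such as a StatefulSet or Deployment (spec.template.spec.*).
    """
    spec = obj.get("spec", {})
    # Workloads wrap the pod spec in a template; plain Pods do not.
    pod_spec = spec.get("template", {}).get("spec", spec)
    return pod_spec.get("automountServiceAccountToken")


# Hypothetical StatefulSet fragment with token automounting disabled.
sample_statefulset = {
    "kind": "StatefulSet",
    "spec": {"template": {"spec": {"automountServiceAccountToken": False}}},
}

if automount_setting(sample_statefulset) is False:
    print("automountServiceAccountToken is explicitly false on this workload")
```

A result of `None` means the field is not set on the pod spec at all, so the value configured on the Service Account (or the Kubernetes default) applies instead.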
In case of issues related to a specific bucket implementation, please ping the corresponding maintainer from the list here: https://github.com/thanos-io/thanos/blob/main/docs/storage.md
@vglafirov
I do see this change in the latest release which might be relevant:
#6891