Databricks provider does not support Azure managed identity if more than one identity exists (e.g. ADF, AKS) #38762
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
I'm facing exactly the same problem and I was just starting to look into it myself! Although I initially considered the same solution as you, I feel that just adding these extra params might only partially address the issue. I think it would be a much better option to leverage DefaultAzureCredential / ManagedIdentityCredential from azure.identity, as is done by the Microsoft Azure provider and is recommended by Microsoft. @jtv8, let me know if you'd like me to help with a PR (I have never contributed to Airflow before, and testing managed identities in a development environment isn't the easiest, so it might take me some time).
Hi @ghjklw! I agree that it would be better to leverage the abstractions in the official Microsoft libraries - the current implementation is far too low level. However, that would require a substantial rewrite. I certainly don't have the time to do that, and to be honest even setting up a development environment and unit tests for the quick fix I've proposed is a bit beyond my comfort zone given my unfamiliarity with Airflow and its architecture. In an ideal world, I'd like to think Microsoft would contribute this work themselves, given that they now provide both Airflow and Databricks commercially as managed services, and one would expect them to work together via managed identities out of the box. Perhaps one of the project maintainers could kindly suggest it if they have open channels of communication with Microsoft about their commercial offerings?
Not as far as I know. But we would love the Microsoft team to contribute this and other relevant features.
Any updates on this? We are currently also running into this issue. To give a bit more context:
Let me repeat:
If anyone wants to contribute a fix for this issue, they are free to do so, and we even marked it as "good first issue". If the Microsoft team does not care about it and will not step up and fix it, then it has to wait for someone who will. My suggestion is that you raise it with the ADF team through your regular support channel, and let them know that it's an issue for you and that the best thing they can do is to contribute a fix directly here. It would be great to see Microsoft contributing to issues related to their platform, as the Amazon and Google teams do. That would be a nice change. You can also consider switching to another managed Airflow provider :)
Apache Airflow version
Other Airflow 2 version (please specify below)
If "Other Airflow 2 version" selected, which one?
2.6.3
What happened?
When trying to authenticate with an Azure managed identity, if more than one managed identity exists on the virtual machine (this is always true when using Azure Managed Airflow, and common when using Azure Kubernetes Service), the connection will return the following error:
What you think should happen instead?
The solution to this problem is to allow the user to supply values to be passed to the Azure Instance Metadata Service token endpoint as the object_id, client_id and msi_res_id parameters, as documented here: https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-to-use-vm-token#get-a-token-using-http
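For illustration only, here is a minimal sketch of what selecting a specific identity on the token request could look like. It is not the code proposed in this issue and not the provider's actual implementation: the function name build_imds_token_request is hypothetical, the endpoint URL, api-version value and Metadata header are taken from the Microsoft documentation linked above, and the check that at most one selector is sent is an assumption of this sketch.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Azure Instance Metadata Service (IMDS) token endpoint and API version,
# as described in the Microsoft documentation linked above.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"
IMDS_API_VERSION = "2018-02-01"


def build_imds_token_request(
    resource: str,
    *,
    client_id: Optional[str] = None,
    object_id: Optional[str] = None,
    msi_res_id: Optional[str] = None,
) -> Request:
    """Build (but do not send) a token request for one managed identity.

    Hypothetical helper: the name and signature are illustrative only.
    """
    selectors = {
        "client_id": client_id,
        "object_id": object_id,
        "msi_res_id": msi_res_id,
    }
    # Keep only the selectors the caller actually supplied.
    chosen = {name: value for name, value in selectors.items() if value}

    # Assumption of this sketch: send at most one identity selector.
    if len(chosen) > 1:
        raise ValueError(
            "Specify at most one of client_id, object_id or msi_res_id"
        )

    params = {"api-version": IMDS_API_VERSION, "resource": resource, **chosen}
    url = f"{IMDS_TOKEN_URL}?{urlencode(params)}"

    # IMDS requires the "Metadata: true" header on every request.
    return Request(url, headers={"Metadata": "true"})
```

With no selector supplied, the request is the same as one that names no identity, which is the case that fails when several identities are attached. Passing, say, client_id="<your-identity-client-id>" adds that single parameter to the query string. Actually sending the request (for example with urllib.request.urlopen) only works from inside an Azure resource that has a managed identity.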
Here's an example implementation showing how airflow/providers/databricks/hooks/databricks_base.py could be changed to support this:
Before:
After:
How to reproduce
use_azure_managed_identity set to true
Operating System
n/a
Versions of Apache Airflow Providers
No response
Deployment
Microsoft ADF Managed Airflow
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct