Connection test: [ERROR] - dbt-databricks behind proxy #111
Comments
@thuanvan thanks for filing this. Investigating whether our Python connector supports HTTP proxies. Will get back to you!
Getting this in the curl test: Error 500 javax.servlet.ServletException: org.apache.thrift.transport.TTransportException
@thuanvan I verified that we don't support proxies yet. We'll get this prioritized on our roadmap.
Thanks for confirming, and thank you for prioritizing it. We'll look into how to get a proxy bypass.
Odd, since we have working instances where we go through a proxy. Can you elaborate?
@thuanvan I'm waiting for the engineer to come back from vacation. He'll look into it.
I did some analysis previously: the dbt-databricks adapter is based on the Thrift protocol. It is RPC, not REST, and I cannot find an easy way to make it go through a proxy. Our team's workaround is to use Databricks IP whitelisting to protect the Databricks workspace.
@xg1990 @thuanvan we are going to add proxy support to the Python connector first (databricks/databricks-sql-python#22). Then we will add support to dbt-databricks.
When databricks/databricks-sql-python#22 is fixed, will that also fix this issue?
@thuanvan we're still waiting for databricks/databricks-sql-python#22 to land.
Hey folks just letting you know that we have a fix for this under review in |
@susodapop I have just encountered this problem, but unfortunately your suggestion did not fix it. I just get a lot of "Hey I was called!" messages before ending up with "failed to connect".
@alexdiem thanks for the report! There's not enough information in your message to reproduce your issue. What values did you use for your proxy environment variables? Of course, redact any sensitive information.
It is the exact same problem as Thuan has (we are colleagues in the same office), and several others have it as well. I have set |
Looks like the problem is in thrift.transport.THttpClient. Problem code:

    @staticmethod
    def basic_proxy_auth_header(proxy):
        if proxy is None or not proxy.username:
            return None
        ap = "%s:%s" % (urllib.parse.unquote(proxy.username),
                        urllib.parse.unquote(proxy.password))
        cr = base64.b64encode(ap).strip()
        return "Basic " + cr

In my test, the values of the HTTP(S)_PROXY environment variables are captured correctly, but ap is a "regular" string as opposed to a byte string, so the base64.b64encode() call fails. Encoding ap before the call and decoding the result avoids the failure:

    @staticmethod
    def basic_proxy_auth_header(proxy):
        if proxy is None or not proxy.username:
            return None
        ap = "%s:%s" % (urllib.parse.unquote(proxy.username),
                        urllib.parse.unquote(proxy.password))
        cr = base64.b64encode(ap.encode()).decode().strip()
        return "Basic " + cr

However, since the problem is in the thrift package, we can't simply fix it in this project...
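The str-versus-bytes failure described in the comment above can be reproduced without thrift installed. The following is a minimal standalone sketch, not the thrift or connector code: the helper name `build_basic_proxy_auth_header` and the proxy URL are hypothetical and only mirror the logic quoted above.

```python
import base64
import urllib.parse


def build_basic_proxy_auth_header(proxy_url):
    """Return a 'Basic ...' header value, or None if the URL has no username.

    Hypothetical helper for illustration; it mirrors the quoted logic only.
    """
    proxy = urllib.parse.urlparse(proxy_url)
    if not proxy.username:
        return None
    # urlparse leaves percent-escapes in place, so unquote both parts.
    password = proxy.password or ""
    credentials = "%s:%s" % (urllib.parse.unquote(proxy.username),
                             urllib.parse.unquote(password))
    # On Python 3, b64encode() takes bytes and returns bytes: encode the
    # input, then decode the result so it can be joined to a str.
    return "Basic " + base64.b64encode(credentials.encode()).decode()


# Made-up proxy URL; the password "p@ss" is percent-encoded as "p%40ss".
example_proxy = "http://user:p%40ss@proxy.example.com:8080"

if __name__ == "__main__":
    try:
        # Passing a str, as the quoted problem code does, raises TypeError.
        base64.b64encode("user:p@ss")
    except TypeError as exc:
        print("str input fails:", exc)
    print(build_basic_proxy_auth_header(example_proxy))
```

The sketch treats a missing password as an empty string, which is an assumption of this example rather than something the quoted code does.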
@msdotnetclr we've actually fixed this in databricks-sql-connector without needing to update the upstream thrift dependency. It needs to merge and be deployed to PyPI, then we'll update the dbt-databricks dependency and proxies will work.
Here's the PR that fixes it in databricks-sql-connector: databricks/databricks-sql-python#81
Ah, nice. I only started to play around with the connector this morning and did not get to look into other linked issues. Good to know there is a better fix already!
The fix has merged into databricks-sql-connector and is part of release v2.5.0. I'll open a PR here that bumps the dependency so we pick up the proxy fix.
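To check whether a given environment already has a connector release that should carry the proxy fix, something like the sketch below may help. It assumes only what the comment above states (the fix is part of v2.5.0) and that the package is installed under its PyPI name databricks-sql-connector. The helper names are hypothetical, and the version parsing is deliberately simplistic.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '2.5.0' into (2, 5, 0), padding short versions with zeros.

    Simplistic on purpose: a suffix such as 'rc1' is ignored, so a
    pre-release of 2.5.0 is treated the same as 2.5.0 itself.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def includes_proxy_fix(version):
    """True if the version is 2.5.0 or later (the release named above)."""
    return version_tuple(version) >= (2, 5, 0)


def installed_connector_version():
    """Return the installed databricks-sql-connector version, or None."""
    try:
        return metadata.version("databricks-sql-connector")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    installed = installed_connector_version()
    if installed is None:
        print("databricks-sql-connector is not installed")
    else:
        print(installed, "includes proxy fix:", includes_proxy_fix(installed))
```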
Describe the bug
A clear and concise description of what the bug is. What command did you run? What happened?
dbt debug gives error
Connection test: [ERROR]
1 check failed:
dbt was unable to connect to the specified database.
The database returned the following error:
ENV set
HTTP_PROXY
HTTPS_PROXY
It does not seem that the proxy environment variables are being used
curl to host/http_path is OK
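When a report like this one depends on proxy environment variables, it helps to capture what the Python process actually sees. The sketch below is a hypothetical diagnostic, not part of dbt or the connector: it lists the proxy variables with any password masked, and prints what the standard library's `urllib.request.getproxies()` detects. Whether a given connector honors these settings is a separate question.

```python
import os
import urllib.parse
import urllib.request


def redact(url):
    """Mask the password part of a proxy URL, if it has one."""
    parts = urllib.parse.urlsplit(url)
    if parts.password is None:
        return url
    netloc = parts.netloc.replace(":" + parts.password + "@", ":***@", 1)
    return urllib.parse.urlunsplit(parts._replace(netloc=netloc))


def report_proxy_settings(environ=None):
    """Return the proxy variables that are set, keyed by lowercase name."""
    environ = os.environ if environ is None else environ
    found = {}
    for name in ("http_proxy", "https_proxy", "no_proxy"):
        # Check both spellings; tools differ on which case they read.
        value = environ.get(name) or environ.get(name.upper())
        if value:
            found[name] = redact(value)
    return found


if __name__ == "__main__":
    print("environment:", report_proxy_settings())
    # getproxies() is the standard library's own view of the proxy settings.
    # Its values are printed unredacted, so review them before sharing.
    print("urllib view:", urllib.request.getproxies())
```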
Steps To Reproduce
In as much detail as possible, please provide steps to reproduce the issue. Sample data that triggers the issue, example model code, etc is all very helpful here.
dbt debug
Expected behavior
A clear and concise description of what you expected to happen.
connection test OK
Screenshots and log output
If applicable, add screenshots or log output to help explain your problem.
System information
The output of dbt --version:
Core:
Plugins:
The operating system you're using:
ubuntu
The output of python --version:
Python 3.8.10
Additional context
Add any other context about the problem here.