[ADAP-692] [Bug] snowflake-connector-python dependency version too strict #687

Closed
2 tasks done
ivanstillfront opened this issue Jul 10, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@ivanstillfront

Is this a new bug in dbt-snowflake?

  • I believe this is a new bug in dbt-snowflake
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When using dbt with Airflow 2.5.1, Python package installation fails because the Airflow constraints file pins snowflake-connector-python to 2.9.0, which is incompatible with dbt-snowflake, which requires version 3.0 or later (~=3.0).

Expected Behavior

Package installation should not fail.

Steps To Reproduce

Create a requirements.txt file with content:

--constraint https://raw.githubusercontent.com/apache/airflow/constraints-2.5.1/constraints-3.10.txt
dbt-snowflake

Running pip install -r requirements.txt then fails.

Relevant log output

The conflict is caused by:
    dbt-snowflake 1.5.2 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.5.1 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.5.0 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.4.3 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.4.2 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.4.1 depends on snowflake-connector-python[secure-local-storage]<2.8.2 and >=2.4.1
    dbt-snowflake 1.4.0 depends on snowflake-connector-python[secure-local-storage]<2.8.2 and >=2.4.1
    dbt-snowflake 1.3.2 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.3.1 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.3.0 depends on cryptography<37.0.0 and >=3.2
    dbt-snowflake 1.2.1 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.2.0 depends on cryptography<37.0.0 and >=3.2
    dbt-snowflake 1.1.1 depends on snowflake-connector-python[secure-local-storage]~=3.0
    dbt-snowflake 1.1.0 depends on snowflake-connector-python[secure-local-storage]<2.8.0 and >=2.4.1
    dbt-snowflake 1.0.1 depends on snowflake-connector-python[secure-local-storage]<2.8.0 and >=2.4.1
    dbt-snowflake 1.0.0 depends on snowflake-connector-python[secure-local-storage]<2.8.0 and >=2.4.1
    dbt-snowflake 0.21.1 depends on cryptography<4 and >=3.2
    dbt-snowflake 0.21.0 depends on cryptography<4 and >=3.2
    dbt-snowflake 0.20.2 depends on cryptography<4 and >=3.2
    dbt-snowflake 0.20.1 depends on cryptography<4 and >=3.2
    dbt-snowflake 0.20.0 depends on snowflake-connector-python[secure-local-storage]~=2.4.1
    dbt-snowflake 0.19.2 depends on cryptography<4 and >=3.2
    dbt-snowflake 0.19.1 depends on cryptography<4 and >=3.2
    dbt-snowflake 0.19.0 depends on cryptography<4 and >=3.2
    dbt-snowflake 0.18.2 depends on snowflake-connector-python[secure-local-storage]==2.2.10
    dbt-snowflake 0.18.1 depends on snowflake-connector-python[secure-local-storage]==2.2.10
    dbt-snowflake 0.18.0 depends on snowflake-connector-python==2.2.10
    dbt-snowflake 0.17.2 depends on azure-storage-blob<12.0.0
    dbt-snowflake 0.17.1 depends on azure-storage-blob<12.0.0
    dbt-snowflake 0.17.0 depends on azure-storage-blob<12.0.0
    dbt-snowflake 0.16.1 depends on azure-storage-blob<12.0.0
    dbt-snowflake 0.16.0 depends on azure-storage-blob<12.0.0
    dbt-snowflake 0.15.3 depends on azure-storage-blob~=2.1
    dbt-snowflake 0.15.2 depends on snowflake-connector-python<2.1 and >=1.6.12
    dbt-snowflake 0.15.1 depends on snowflake-connector-python<2.1 and >=1.6.12
    dbt-snowflake 0.15.0 depends on snowflake-connector-python<2.1 and >=1.6.12
    dbt-snowflake 0.14.4 depends on snowflake-connector-python==2.0.3
    The user requested (constraint) azure-storage-blob==12.14.1
    The user requested (constraint) snowflake-connector-python==2.9.0
    The user requested (constraint) cryptography==38.0.4

Environment

- OS:
- Python:
- dbt-core:
- dbt-snowflake:

Additional Context

The snowflake-connector-python requirement was bumped to 3.0.0 because of a Snyk vulnerability report; see commit 967a8e9.

But it would have been sufficient to bump it to 2.8.2 or higher. The dbt-snowflake library can be a lot more flexible when the requirement is defined like so:

"snowflake-connector-python[secure-local-storage]>=2.8.2"

Happy to push a PR if the maintainers agree with this change.
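To illustrate why the pinned 2.9.0 is rejected by ~=3.0 but would be accepted by the proposed >=2.8.2, here is a minimal sketch using only the standard library. The helper names (parse, at_least, compatible_release) are hypothetical, and the parsing handles only plain dotted numeric versions; for real use, the SpecifierSet class from the third-party packaging library is the more robust choice.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a plain dotted version such as '2.9.0' into integers."""
    return tuple(int(part) for part in version.split("."))


def at_least(version: str, minimum: str) -> bool:
    """Approximate the '>=' specifier, padding with zeros so 3.0 == 3.0.0."""
    v, m = parse(version), parse(minimum)
    width = max(len(v), len(m))
    v += (0,) * (width - len(v))
    m += (0,) * (width - len(m))
    return v >= m


def compatible_release(version: str, base: str) -> bool:
    """Approximate '~=': '~=X.Y' means '>=X.Y' and '==X.*' (PEP 440).

    The base must have at least two segments, as PEP 440 requires.
    """
    prefix = parse(base)[:-1]
    return at_least(version, base) and parse(version)[: len(prefix)] == prefix


pinned = "2.9.0"  # version pinned by the Airflow 2.5.1 constraints file

print(compatible_release(pinned, "3.0"))  # current requirement: ~=3.0
print(at_least(pinned, "2.8.2"))          # proposed requirement: >=2.8.2
```

The first check fails because 2.9.0 is below 3.0, while the second passes, which is the flexibility the proposed specifier is meant to provide.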

@ivanstillfront ivanstillfront added bug Something isn't working triage labels Jul 10, 2023
@github-actions github-actions bot changed the title [Bug] snowflake-connector-python dependency version too strict [ADAP-692] [Bug] snowflake-connector-python dependency version too strict Jul 10, 2023
@dataders
Contributor

@ivanstillfront thanks for opening this and doing your homework! I understand the frustration of too strict dependencies.

My hesitation is that we often experience user-facing issues that require fixes in the upstream connector library. The PR in which the snowflake connector version was bumped, #476, makes reference to #393 (correspondingly snowflakedb/snowflake-connector-python#1274)

As we speak, we're finally on a path to resolving a long-standing multi-threading seg fault issue that dbt-snowflake users have been experiencing in both Core and Cloud, which will likely be resolved in snowflakedb/snowflake-connector-python#1627. Once this happens, we plan to bump the minimum required version of the connector to 3.0.5 for at least versions 1.5 and 1.6 of dbt-snowflake.

I'm not trying to say that we don't want to fix your problem, rather that's the perspective from which we've been thinking lately. Maybe there's a middle ground here? Can we open a PR with airflow to bump the upper limit on their version constraint? I'd love to hear more about your perspective on next steps. cheers

@ivanstillfront
Author

Thank you @dataders. @nenkie76 has opened an issue with AWS MWAA, but I doubt they will modify the constraints because they are provided by Airflow.

Since AWS maintains the constraints for their managed Airflow system, our hands are tied in this matter. Our only options are to fork dbt-snowflake or run our own Airflow deployment.

@dataders
Contributor

@ivanstillfront appreciate the teamwork! I accidentally went down a rabbit hole and have some information to share.

your file, constraints-2.5.1/constraints-3.10.txt, does indeed have an earlier version of the connector library, but the version I saw mentioned on the main branch of the Airflow repo is constraints-main/constraints-3.8.txt, which actually has the latest version of snowflake_connector_python pinned.

aws/aws-mwaa-local-runner#243 was instructive: it showed that this AWS managed Airflow repo depends on another PyPI package (sauce), apache-airflow-providers-snowflake (PyPI). The latest version of this package is 4.3.0.

Therefore, I don't believe that apache-airflow-providers-snowflake is the issue because, iiuc, this provider package specifies snowflake-connector-python>=2.4.1 (sauce).

Instead, I'm very suspicious of how aws/aws-mwaa-local-runner's Docker image is configured. There is a hard-coded, committed docker/config/constraints.txt in which apache-airflow-providers-snowflake is hard pinned to be 4.0.2. Until April 11, this requirement was 3.3.0.

The first two lines of that file are telling:

# This constraints file was automatically generated on 2023-01-18T18:46:04Z
# via "eager-upgrade" mechanism of PIP. For the "v2-5-test" branch of Airflow.

I am still very uncertain as to how Airflow uses static GitHub links to auto-generate these constraint files, but certain enough to be suspicious of hard-coding.

Airflow's guidance on the constraints file, apache/airflow/blob/main/constraints/README.md, says the following:

For development use only, you can store your own version of constraint files used during CI image build here and refer to them during building of the image using --airflow-constraints location constraints/constraints.txt This allows you to iterate on dependencies without having to run --upgrade-to-newer-dependencies flag continuously.

I think the real "issue" is that this constraints file is being used "in production" when it really is only meant for temporary, local development and CI build environments. The smoking gun for me is that all of those dependencies are hard-pinned (`==`), which by definition is not flexible.
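As a rough way to see how rigid such a file is, the following sketch lists every hard pin (==) in a constraints file. The sample text reuses the three pins reported in the pip log earlier in this thread; the function name find_hard_pins and the regular expression are assumptions for illustration, not something taken from Airflow or AWS tooling.

```python
import re

# Illustrative sample only: a real constraints file has hundreds of lines.
SAMPLE_CONSTRAINTS = """\
# illustrative excerpt of a constraints file
azure-storage-blob==12.14.1
cryptography==38.0.4
snowflake-connector-python==2.9.0
"""

# Matches lines of the form "name==version" (hard pins only).
PIN_PATTERN = re.compile(r"^([A-Za-z0-9._-]+)==(\S+)$")


def find_hard_pins(text: str) -> dict[str, str]:
    """Return {package: version} for every '==' pin in the given text."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        match = PIN_PATTERN.match(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


pins = find_hard_pins(SAMPLE_CONSTRAINTS)
for name, version in sorted(pins.items()):
    print(f"{name} is hard-pinned to {version}")
```

Any package that appears in this output can only be installed at exactly that version while the constraints file is in effect, regardless of what range other packages would accept.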

The workaround, imo, would be to submit a pull request to aws/aws-mwaa-local-runner that updates the existing constraints file to refer to a newer version.

The long-term solution is to improve the version specification strategy that is implemented by aws/aws-mwaa-local-runner.

@dataders dataders removed the triage label Jul 12, 2023
@dataders
Contributor

@ivanstillfront closing this issue for now. please re-open if you think there's a way we can be of help here

@ivanstillfront
Author

@dataders thank you for that elaborate investigation, sorry I could not reply sooner. You are correct, AWS is using the constraints "in production" which is a huge PITA to work with. Here is their documentation on this matter: https://docs.aws.amazon.com/mwaa/latest/userguide/working-dags-dependencies.html#working-dags-dependencies-syntax-create

Long term, we will most likely stop using AWS managed Airflow and roll our own deployment. For now, we have forked dbt-snowflake and patched install_requires in setup.py to satisfy the constraints.
