[BUG]: Error in the Materializer for integration with Langchain >= 0.0.325 #2012

Closed
1 task done
leoregino opened this issue Nov 2, 2023 · 2 comments
Labels
bug Something isn't working


leoregino commented Nov 2, 2023

Contact Details [Optional]

No response

System Information

ZENML_LOCAL_VERSION: 0.42.1
ZENML_SERVER_VERSION: 0.42.1
ZENML_SERVER_DATABASE: mysql
ZENML_SERVER_DEPLOYMENT_TYPE: gcp
ZENML_CONFIG_DIR: xxxx
ZENML_LOCAL_STORE_DIR: xxxx
ZENML_SERVER_URL: (I'm a CAB client)
ZENML_ACTIVE_REPOSITORY_ROOT: xxxx
PYTHON_VERSION: 3.10.9
ENVIRONMENT: native
SYSTEM_INFO: {'os': 'mac', 'mac_version': '13.5.2'}
ACTIVE_WORKSPACE: default
ACTIVE_STACK: local
ACTIVE_USER: xxxx
TELEMETRY_STATUS: enabled
ANALYTICS_CLIENT_ID: xxxx
ANALYTICS_USER_ID: xxxx
ANALYTICS_SERVER_ID: xxxx
INTEGRATIONS: ['gcp', 'kaniko', 'kubeflow', 'mlflow', 'openai', 'pillow', 'pytorch', 'scipy', 'sklearn', 'slack', 'langchain']
PACKAGES: {'regex': '2023.10.3', 'gcsfs': '2023.10.0', 'fsspec': '2023.10.0', 'certifi': '2023.7.22', 'jsonschema-specifications':
'2023.7.1', 'pytz': '2022.7.1', 'setuptools': '65.5.0', 'cryptography': '41.0.5', 'kubernetes': '25.3.0', 'pyzmq': '25.1.1', 'gevent':
'23.9.1', 'aiofiles': '23.2.1', 'packaging': '23.2', 'attrs': '23.1.0', 'argon2-cffi': '23.1.0', 'azure-mgmt-resource': '23.0.1',
'pip': '22.3.1', 'argon2-cffi-bindings': '21.2.0', 'isoduration': '20.11.0', 'gunicorn': '20.1.0', 'rich': '12.6.0', 'pyarrow':
'11.0.0', 'pillow': '10.1.0', 'ipython': '8.17.2', 'jupyter-client': '8.5.0', 'tenacity': '8.2.3', 'click': '8.1.3', 'nbconvert':
'7.10.0', 'ipywidgets': '7.8.1', 'overrides': '7.4.0', 'notebook': '7.0.6', 'ipykernel': '6.26.0', 'importlib-metadata': '6.8.0',
'tornado': '6.3.3', 'docker': '6.1.3', 'zope.interface': '6.1', 'bleach': '6.1.0', 'importlib-resources': '6.1.0', 'multidict':
'6.0.4', 'pyyaml': '6.0.1', 'traitlets': '5.13.0', 'psutil': '5.9.6', 'nbformat': '5.9.2', 'jupyter-core': '5.5.0', 'cachetools':
'5.3.2', 'decorator': '5.1.1', 'smmap': '5.0.1', 'zope.event': '5.0', 'tqdm': '4.66.1', 'fonttools': '4.43.1', 'transformers':
'4.30.1', 'jsonschema': '4.19.2', 'beautifulsoup4': '4.12.2', 'rsa': '4.9', 'typing-extensions': '4.8.0', 'pexpect': '4.8.0', 'gitdb':
'4.0.11', 'jupyterlab': '4.0.7', 'async-timeout': '4.0.3', 'bcrypt': '4.0.1', 'slack-sdk': '3.23.0', 'google-cloud-build': '3.21.0',
'protobuf': '3.20.3', 'marshmallow': '3.20.1', 'zipp': '3.17.0', 'filelock': '3.13.1', 'platformdirs': '3.11.0', 'orjson': '3.9.10',
'aiohttp': '3.8.6', 'matplotlib': '3.8.1', 'nltk': '3.8.1', 'anyio': '3.7.1', 'widgetsnbextension': '3.6.6', 'google-cloud-bigquery':
'3.6.0', 'markdown': '3.5.1', 'google-cloud-logging': '3.5.0', 'asyncio': '3.4.3', 'idna': '3.4', 'charset-normalizer': '3.3.2',
'oauthlib': '3.2.2', 'networkx': '3.2.1', 'threadpoolctl': '3.2.0', 'gitpython': '3.1.40', 'jinja2': '3.1.2', 'prompt-toolkit':
'3.0.39', 'mistune': '3.0.2', 'greenlet': '3.0.1', 'werkzeug': '3.0.1', 'uritemplate': '3.0.1', 'tritonclient': '2.36.0',
'google-cloud-container': '2.33.0', 'requests': '2.31.0', 'jupyterlab-server': '2.25.0', 'google-auth': '2.23.4',
'google-cloud-bigquery-storage': '2.22.0', 'pycparser': '2.21', 'fastjsonschema': '2.18.1', 'google-cloud-secret-manager': '2.16.4',
'pygments': '2.16.1', 'babel': '2.13.1', 'google-cloud-storage': '2.13.0', 'google-api-core': '2.12.0', 'google-cloud-scheduler':
'2.11.2', 'jupyter-server': '2.9.1', 'types-python-dateutil': '2.8.19.14', 'python-dateutil': '2.8.2', 'pyjwt': '2.8.0',
'google-resumable-media': '2.6.0', 'soupsieve': '2.5', 'pyparsing': '2.4.7', 'asttokens': '2.4.1', 'jsonpointer': '2.4', 'flask':
'2.3.3', 'google-cloud-core': '2.3.3', 'termcolor': '2.3.0', 'sentence-transformers': '2.2.2', 'mlflow': '2.2.2', 'cloudpickle':
'2.2.1', 'jupyter-lsp': '2.2.0', 'markupsafe': '2.1.3', 'itsdangerous': '2.1.2', 'torch': '2.1.0', 'python-json-logger': '2.0.7',
'async-lru': '2.0.4', 'geventhttpclient': '2.0.2', 'kafka-python': '2.0.2', 'tomli': '2.0.1', 'executing': '2.0.1',
'googleapis-common-protos': '1.61.0', 'grpcio': '1.59.2', 'grpcio-status': '1.48.2', 'jsonpatch': '1.33', 'azure-core': '1.29.5',
'urllib3': '1.26.18', 'google-cloud-aiplatform': '1.26.0', 'numpy': '1.22.4', 'proto-plus': '1.22.3', 'six': '1.16.0', 'cffi':
'1.16.0', 'wrapt': '1.15.0', 'google-cloud-functions': '1.13.3', 'python-rapidjson': '1.13', 'webcolors': '1.13',
'google-api-python-client': '1.12.11', 'sympy': '1.12', 'scipy': '1.11.3', 'pydantic': '1.10.13', 'google-cloud-resource-manager':
'1.10.4', 'backoff': '1.10.0', 'yarl': '1.9.2', 'kfp': '1.8.22', 'shapely': '1.8.5.post1', 'kfp-server-api': '1.8.5',
'pydata-google-auth': '1.8.2', 'send2trash': '1.8.2', 'alembic': '1.8.1', 'distro': '1.8.0', 'debugpy': '1.8.0', 'passlib': '1.7.4',
'blinker': '1.7.0', 'websocket-client': '1.6.4', 'monotonic': '1.6', 'nest-asyncio': '1.5.8', 'pandas': '1.5.3', 'fqdn': '1.5.1',
'google-crc32c': '1.5.0', 'pandocfilters': '1.5.0', 'sqlalchemy': '1.4.50', 'kiwisolver': '1.4.5', 'analytics-python': '1.4.post1',
'absl-py': '1.4.0', 'frozenlist': '1.4.0', 'azure-mgmt-core': '1.4.0', 'mlserver': '1.3.5', 'mlserver-mlflow': '1.3.5', 'joblib':
'1.3.2', 'google-cloud-appengine-logging': '1.3.2', 'aiosignal': '1.3.1', 'requests-oauthlib': '1.3.1', 'arrow': '1.3.0',
'uri-template': '1.3.0', 'mpmath': '1.3.0', 'sniffio': '1.3.0', 'deprecated': '1.2.14', 'querystring-parser': '1.2.4', 'mako':
'1.2.4', 'cloud-sql-python-connector': '1.2.3', 'scikit-learn': '1.2.2', 'tinycss2': '1.2.1', 'azure-common': '1.1.28',
'jupyterlab-widgets': '1.1.7', 'exceptiongroup': '1.1.3', 'contourpy': '1.1.1', 'db-dtypes': '1.1.1', 'google-auth-oauthlib': '1.1.0',
'brotli': '1.1.0', 'pymysql': '1.0.3', 'mypy-extensions': '1.0.0', 'python-dotenv': '1.0.0', 'fastapi': '0.89.1', 'numba': '0.58.1',
'shap': '0.43.0', 'zenml': '0.42.1', 'wheel': '0.41.3', 'llvmlite': '0.41.1', 'sqlalchemy-utils': '0.38.3', 'referencing': '0.30.2',
'openai': '0.27.9', 'asyncpg': '0.27.0', 'uvicorn': '0.23.2', 'starlette': '0.22.0', 'pandas-gbq': '0.19.1', 'jedi': '0.19.1',
'httplib2': '0.19.1', 'uvloop': '0.19.0', 'validators': '0.18.2', 'databricks-cli': '0.18.0', 'prometheus-client': '0.18.0',
'huggingface-hub': '0.18.0', 'terminado': '0.17.1', 'starlette-exporter': '0.16.0', 'torchvision': '0.16.0', 'docstring-parser':
'0.15', 'h11': '0.14.0', 'tokenizers': '0.13.3', 'grpc-google-iam-v1': '0.12.6', 'cycler': '0.12.1', 'rpds-py': '0.10.6',
'python-terraform': '0.10.1', 'requests-toolbelt': '0.10.1', 'json5': '0.9.14', 'commonmark': '0.9.1', 'typer': '0.9.0',
'typing-inspect': '0.9.0', 'tabulate': '0.9.0', 'parso': '0.8.3', 'aiokafka': '0.8.1', 'jupyter-events': '0.8.0', 'nbclient': '0.8.0',
'defusedxml': '0.7.1', 'ptyprocess': '0.7.0', 'py-grpc-prometheus': '0.7.0', 'stack-data': '0.6.3', 'dataclasses-json': '0.6.1',
'isodate': '0.6.1', 'webencodings': '0.5.1', 'fire': '0.5.0', 'pyasn1': '0.5.0', 'sqlparse': '0.4.4', 'jupyter-server-terminals':
'0.4.4', 'entrypoints': '0.4', 'safetensors': '0.4.0', 'pyasn1-modules': '0.3.0', 'click-params': '0.3.0', 'wcwidth': '0.2.9',
'google-cloud-audit-log': '0.2.5', 'notebook-shim': '0.2.3', 'pure-eval': '0.2.2', 'jupyterlab-pygments': '0.2.2', 'ipython-genutils':
'0.2.0', 'sentencepiece': '0.1.99', 'kfp-pipeline-spec': '0.1.16', 'strip-hints': '0.1.10', 'pgvector': '0.1.8', 'matplotlib-inline':
'0.1.6', 'rfc3339-validator': '0.1.4', 'comm': '0.1.4', 'appnope': '0.1.3', 'google-auth-httplib2': '0.1.1', 'rfc3986-validator':
'0.1.1', 'langchain': '0.0.325', 'langsmith': '0.0.56', 'sqlmodel': '0.0.11', 'slicer': '0.0.7', 'sqlalchemy2-stubs': '0.0.2a36'}

CURRENT STACK

Name: local
ID: xxxx
Shared: No
User: xxxx
Workspace: xxxx

ORCHESTRATOR: default

Name: default
ID: xxxx
Type: orchestrator
Flavor: local
Configuration: {}
Shared: No
User: xxxx
Workspace: default / xxxx

ARTIFACT_STORE: default

Name: default
ID: xxxx
Type: artifact_store
Flavor: local
Configuration: {'path': ''}
Shared: No
User: xxxx
Workspace: default / xxxx

EXPERIMENT_TRACKER: gke_mlflow_experiment_tracker

Name: gke_mlflow_experiment_tracker
ID: xxxxx
Type: experiment_tracker
Flavor: mlflow
Configuration: {'experiment_name': None, 'nested': False, 'tags': {}, 'tracking_uri': 'http://34.140.227.247/mlflow/',
'tracking_username': '', 'tracking_password': '', 'tracking_token': '********', 'tracking_insecure_tls': False,
'databricks_host': None}
Shared: No
User: xxxx
Workspace: default / xxxx

ALERTER: slack_alerter

Name: slack_alerter
ID: xxxx
Type: alerter
Flavor: slack
Configuration: {'slack_token': '********', 'default_slack_channel_id': 'xxxxxx'}
Shared: No
User: xxxx
Workspace: default / xxxx

IMAGE_BUILDER: local_builder

Name: local_builder
ID: xxxxx
Type: image_builder
Flavor: local
Configuration: {}
Shared: No
User: xxxx
Workspace: default / xxxx

What happened?

After upgrading Langchain to version 0.0.325, the Materializer crashes with the following error:

ImportError: cannot import name 'VectorStore' from 'langchain.vectorstores'

This can be solved by changing one import line in the file vector_store_materializer.py:

OLD CODE:
from langchain.vectorstores import VectorStore

NEW CODE:
from langchain.vectorstores.base import VectorStore
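
If both older and newer Langchain releases need to keep working, one option is to try the new import path first and fall back to the old one. The snippet below is only a sketch of that idea, not the fix that was merged into ZenML; the helper names `import_first_available` and `load_vector_store_base` are hypothetical, and the two module paths are the ones quoted in this report.

```python
import importlib
from typing import Any, Iterable, Tuple


def import_first_available(candidates: Iterable[Tuple[str, str]]) -> Any:
    """Return the first attribute that can be imported from the candidates.

    Each candidate is a (module_path, attribute_name) pair, tried in order.
    """
    errors = []
    for module_path, attr in candidates:
        try:
            module = importlib.import_module(module_path)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # Record why this location failed and try the next one.
            errors.append(f"{module_path}.{attr}: {exc}")
    raise ImportError("No candidate could be imported: " + "; ".join(errors))


def load_vector_store_base() -> Any:
    """Hypothetical helper: locate VectorStore across Langchain versions."""
    return import_first_available(
        [
            # Path suggested in this report for langchain >= 0.0.325.
            ("langchain.vectorstores.base", "VectorStore"),
            # Path used by the materializer before the upgrade.
            ("langchain.vectorstores", "VectorStore"),
        ]
    )
```

Calling `load_vector_store_base()` requires Langchain to be installed; it raises an ImportError listing every location that was tried if neither path works.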

Although this error could be avoided by simply downgrading Langchain, it's important to note that all versions prior to 0.0.325 have a **critical vulnerability/security issue**, as mentioned here: GHSA-prgp-w7vf-ch62

So it's very important to upgrade Langchain and fix this issue.
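
To make sure an environment is not silently downgraded below that release, a small guard could compare the installed version against the threshold. This is a rough sketch: the 0.0.325 threshold comes from the statement above, the function names are hypothetical, and the parser assumes plain dotted numeric versions such as 0.0.325 (pre-release suffixes are not handled).

```python
from importlib import metadata
from typing import Tuple

# Threshold taken from this report; not independently verified here.
MIN_PATCHED = "0.0.325"


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version like '0.0.325' into (0, 0, 325)."""
    return tuple(int(part) for part in version.split("."))


def is_patched(installed: str, minimum: str = MIN_PATCHED) -> bool:
    """Return True if the installed version is at or above the minimum."""
    return parse_release(installed) >= parse_release(minimum)


def installed_langchain_is_patched() -> bool:
    """Check the locally installed langchain; False if it is not installed."""
    try:
        return is_patched(metadata.version("langchain"))
    except metadata.PackageNotFoundError:
        return False
```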

Reproduction steps

  1. Install zenml >= 0.42.1
  2. Install langchain >= 0.0.325
  3. Create a pipeline using langchain in a file run.py
  4. Execute python run.py

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
leoregino added the bug label Nov 2, 2023
Contributor

strickvl commented Nov 2, 2023

Thank you for this report and your suggestion. As it happens, next week we'll be upgrading this dependency + related things like materializers, so this will be fixed. I'll write here once it's merged onto develop.

@strickvl
Contributor

We bumped our langchain integration as I mentioned, so this issue should now be fixed. Thanks for reporting it here @leoregino!
