pyarrow 12.0.0 has a critical vulnerability; upgrading to a newer pyarrow version causes an error in Superset #26153

Closed
nagarajmmu opened this issue Nov 30, 2023 · 8 comments

@nagarajmmu

How to reproduce the bug

  1. Superset 3.0.2 ships with pyarrow 12.0.0.
  2. pyarrow 12.0.0 has a critical vulnerability.
  3. Upgrade pyarrow to 14.0.1 in the Superset environment to resolve the vulnerability; the upgrade itself succeeds.
  4. Once pyarrow is upgraded, start Superset.
  5. Superset fails to start with the error below.

ERROR:

[notice] A new release of pip is available: 23.0.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
Upgrading DB schema...
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 568, in _build_master
    ws.require(__requires__)
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 886, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 777, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (pyarrow 14.0.1 (/usr/local/lib/python3.9/site-packages), Requirement.parse('pyarrow<13,>=12.0.0'), {'apache-superset'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/superset", line 33, in <module>
    sys.exit(load_entry_point('apache-superset', 'console_scripts', 'superset')())
  File "/usr/local/bin/superset", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/local/lib/python3.9/importlib/metadata.py", line 86, in load
    module = import_module(match.group('module'))
  File "/usr/local/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/app/superset/cli/main.py", line 28, in <module>
    from superset.cli.lib import normalize_token
  File "/app/superset/cli/lib.py", line 20, in <module>
    from superset import config
  File "/app/superset/config.py", line 38, in <module>
    import pkg_resources
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3243, in <module>
    def _initialize_master_working_set():
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3226, in _call_aside
    f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3255, in _initialize_master_working_set
    working_set = WorkingSet._build_master()
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 570, in _build_master
    return cls._build_from_requirements(__requires__)
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 583, in _build_from_requirements
    dists = ws.resolve(reqs, Environment())
  File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 772, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'pyarrow<13,>=12.0.0' distribution was not found and is required by apache-superset

Expected results

Once pyarrow is upgraded to 14.0.1, Superset should run and work as it did before the upgrade.

Actual results

After pyarrow is upgraded to 14.0.1, Superset does not start; it fails with the error above.

Environment

(please complete the following information):

  • superset version: superset 3.0.2

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

Please let us know when pyarrow will be upgraded to 14.0.1 in Superset. Otherwise, please let us know if there is any workaround to fix the above issue in Superset after pyarrow is upgraded.

Thanks
Nagaraj M M
@asf-rm

@artofcomputing
Contributor

CVE ID: CVE-2023-47248

@justmike1
Contributor

It is because of the hardcoded version constraint in setup.py

@cwegener
Contributor

cwegener commented Dec 7, 2023

It is because of the hardcoded version constraint in setup.py

Not quite. The root cause of the startup error is deeper than that ...

pkg_resources.DistributionNotFound: The 'pyarrow<13,>=12.0.0' distribution was not found and is required by apache-superset

pkg_resources has been deprecated for a while now, precisely because of issues like this. What is happening here is that when the pkg_resources methods are called in Superset, the whole setup.py actually gets parsed and evaluated, which is of course totally unexpected.
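
The traceback in the report shows this happening: superset/config.py runs `import pkg_resources`, and that import alone checks the declared requirements of apache-superset and raises when one of them is not satisfied. As a rough sketch of the alternative (not necessarily what the linked changes do), the standard-library `importlib.metadata` module can read an installed distribution's version without resolving its dependency constraints. The helper name `installed_version` is hypothetical and only for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent.

    Hypothetical helper: it only reads the distribution's own metadata and
    does not check whether that distribution's requirements are satisfied.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints e.g. a version string for pyarrow, or None if it is not installed.
    print(installed_version("pyarrow"))
```

Because this lookup never calls into pkg_resources, a pyarrow version outside the declared range would not by itself stop the process at import time.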

I thought I had already caught all the pkg_resources removals in these two changes ... but I think I actually forgot about the last occurrence of pkg_resources

#24578

#24514

In summary, there are two issues that are getting conflated:

  1. Changing the version of a direct dependency listed in setup.py to a version that does not meet the constraints defined in setup.py will prevent Superset from starting -> this will be resolved by removing the last occurrence of pkg_resources.
  2. PyArrow should be bumped to 14.0.1, which is being done in "fix: bump pyarrow constraints (CVE-2023-47248)" #26187.
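
To make the first point concrete: the conflict in the traceback arises because 14.0.1 falls outside the declared range 'pyarrow<13,>=12.0.0'. Below is a minimal, stdlib-only sketch of that check. It is a simplification that only understands plain dotted numeric versions and is not PEP 440 compliant; real tooling would more likely use the third-party `packaging` library. The names `parse_version` and `satisfies` are hypothetical.

```python
import operator
import re

# Comparison operators supported by this simplified checker.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}
_CLAUSE = re.compile(r"\s*(>=|<=|==|!=|>|<)\s*(\d+(?:\.\d+)*)\s*")


def parse_version(text):
    """Turn a plain dotted version such as '14.0.1' into (14, 0, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, specifier):
    """Return True if `version` meets every comma-separated clause."""
    have = parse_version(version)
    for clause in specifier.split(","):
        match = _CLAUSE.fullmatch(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound_text = match.groups()
        bound = parse_version(bound_text)
        # Pad with zeros so that '13' compares like '13.0.0'.
        width = max(len(have), len(bound))
        left = have + (0,) * (width - len(have))
        right = bound + (0,) * (width - len(bound))
        if not _OPS[op](left, right):
            return False
    return True
```

With the constraint from the traceback, `satisfies("14.0.1", "<13,>=12.0.0")` is False while `satisfies("12.0.0", "<13,>=12.0.0")` is True, which is why the startup check rejects the upgraded pyarrow until the constraint itself is bumped.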

@nagarajmmu
Author

Hi @cwegener

Thanks for the update. From your comment I understand that it is partially fixed. I hope Superset will no longer hit any startup error.

Thanks
Nagaraj M M

@nagarajmmu
Author

Hi Team

May I know in which Superset version pyarrow is upgraded to 14.0.1? When I install Superset 3.0.2, it still has the old pyarrow version.

I also checked the master branch, but pyarrow is still the old version there.

Please let me know the version in which pyarrow is upgraded.

Thanks in advance
Nagaraj M M

@nagarajmmu
Author

Hi Team

We are in the development phase of an application in which Superset is the visualization component, so we expect Superset to be free of known vulnerabilities.
As noted in the conversation above, pyarrow is being upgraded to 14.0.1, but in which stable version of Superset can we expect this change, and when?

If we could get an approximate timeline, that would be great.

Thanks
Nagaraj M M

@michael-s-molina
Member

Hi @nagarajmmu. pyarrow 14.0.1 should be available in Superset 3.0.3, which will be released in January.

@nagarajmmu
Author

Hi @michael-s-molina

Thanks for the update, I will upgrade my superset to 3.0.3.

Thanks
Nagaraj M M
