Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pip failures can result in incomplete scan of wheel dependency graph #497

Closed
kislyuk opened this issue Aug 26, 2017 · 7 comments
Closed

Comments

@kislyuk
Copy link
Contributor

kislyuk commented Aug 26, 2017

It looks like the wheel dependency graph scanning process can be silently truncated. This results in chalice silently disregarding dependencies when handling wheels in "cross-compiling" mode (deploying from mac os to linux).

class SubprocessPip(object):
"""Wrapper around calling pip through a subprocess."""
def main(self, args, env_vars=None, shim=None):
# type: (List[str], OptStrMap, OptStr) -> Tuple[int, Optional[bytes]]
if env_vars is None:
env_vars = {}
if shim is None:
shim = ''
env_vars.update(subprocess_python_base_environ)
python_exe = sys.executable
run_pip = 'import pip; pip.main(%s)' % args
exec_string = '%s%s' % (shim, run_pip)
invoke_pip = [python_exe, '-c', exec_string]
p = subprocess.Popen(invoke_pip,
stdout=subprocess.PIPE, stderr=subprocess.PIPE,
env=env_vars)
_, err = p.communicate()
rc = p.returncode
return rc, err

While it appears as though the error code is passed on, in fact pip errors are ignored (other than gross ones like if pip is missing entirely). To fix this, you'd need to wrap pip.main() in sys.exit(pip_main()).

@kislyuk
Copy link
Contributor Author

kislyuk commented Aug 26, 2017

This happens outside cross-compiling scenarios too. The errors that I'm seeing in pip download and pip install all have to do with sanitizing the environment variables, in particular PATH. Many packages use the find_executable() function from setuptools/distutils in the build phase. That function tries to access os.environ['PATH'] and blows up. The protobuf package uses find_executable() on load in setup.py, causing pip download to fail as I described above:

Collecting protobuf==3.3.0 (from -r requirements.txt (line 36))
  Using cached protobuf-3.3.0.tar.gz
  Saved /private/var/folders/x3/qctfmn7j0lq95rn608ymq_4w0000gn/T/tmpkwb38x1v/protobuf-3.3.0.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/tmp/pip-build-g70hvnjq/protobuf/setup.py", line 36, in <module>
        protoc = find_executable("protoc")
      File "/usr/local/Cellar/python3/3.6.0/Frameworks/Python.framework/Versions/3.6/lib/python3.6/distutils/spawn.py", line 176, in find_executable
        path = os.environ['PATH']
      File "/Users/akislyuk/projects/data-store/v2/bin/../lib/python3.6/os.py", line 669, in __getitem__
        raise KeyError(key) from None
    KeyError: 'PATH'

@kislyuk
Copy link
Contributor Author

kislyuk commented Aug 26, 2017

So in addition to propagating the pip exit status, I would consider not sanitizing the environment variables (at minimum PATH) so aggressively.

@stealthycoin
Copy link
Contributor

The problem here is that the pip errors are not granular enough to be useful the way I am using pip. If it told me it was a networking error specifically I could retry or failout with a useful message. But for most error cases I DO want to ignore the error since I am more interested in what packages got downloaded into my working directory since in general I combine a bunch of mutually optional steps that can fail and still have the overall deployment succeed. The only real error is a networking error.

One idea I had was to make a pull request to pip to separate out their error codes so that you can tell whether or not a particular error was networking related or not. But the way pip is now there is no useful information I can learn from the fact that pip failed parsing the output text which is possibly subject to change and not something I want to depend on. Simply wrapping pip with sys.exit is not acceptable since that breaks a lot of situations where a valid deployment package would have been built.

If you can give me a specific requirements.txt file that you think should be deployable I can take a look at it and see what can be done. I have not thought about or looked at the cross compiling cases at all, so thats a new set of requirements I have not looked into. Basically you just need path to be passed down through the subprocesses?

@kislyuk
Copy link
Contributor Author

kislyuk commented Aug 27, 2017

I think PATH (at minimum) should be passed down, yes.

Here is a requirements.txt that should work, but does not, in my testing:

protobuf==3.3.0

@kislyuk
Copy link
Contributor Author

kislyuk commented Aug 27, 2017

I'm not sure what the conditions are under which pip download failures should be disregarded. As it stands, failure at some point in the requirements.txt scan (intermittent or not) will result in all further dependencies listed after it, and their subgraphs, being disregarded and excluded from the deployment package. I think that's a problem.

@stealthycoin
Copy link
Contributor

This looks like it can be separated out into two issues.

The other is that subprocess errors do not get propagated up to Chalice from pip which is correct. I will open a separate issue for that as well. #500

One related to environment variables being passed down to the subprocess, which I believe is a bug, since I think this worked at one point and doesn't now due to some changes for the packager related to optional C extensions. #501

Given your example of protobuf there is actually another issue if you get it deployed where it won't be importable under the google namespace with python 2.7 which is what I thought you were initially talking about. That has nothing to do with Chalice itself and only fails in python 2.7 on Lambda. I followed up with the Lambda team and they are aware of the issue. Until it gets fixed I would recommend either using python 3, where protobuf works just fine. You can also as a workaround for the time being add the following code to the module level of your app.py which should also fix the problem.

import site
site.addsitedir('/var/task')

I am going to close this issue, feel free to follow up in either of the new issues. Thanks for the detailed reports!

@kislyuk
Copy link
Contributor Author

kislyuk commented Aug 29, 2017

Thanks. To clarify, we use Python 3.6 only. The issue that I saw with protobuf had to do with protobuf halting the pip download process when PATH is not set, and thereby silently aborting the download of all subsequent wheels, ending up with a deployment with incomplete dependencies. Fixing #500 will make the deploy fail correctly in this situation instead of succeeding. #501 appears to be the root cause.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants