
Added packaging support for whl files. #388

Merged
merged 6 commits into from
Jul 5, 2017

Conversation

stealthycoin
Contributor

@stealthycoin stealthycoin commented Jun 26, 2017

The algorithm for building a lambda compatible bundle is as follows:

  1. Download all packages using pip download without specifying binary
    preference. This allows it to download whl files where possible and
    sdists otherwise.
  2. Try to download lambda-compatible whl files for all sdists
    and lambda-incompatible whl files that were downloaded in the first
    step. Once this step is complete we have all the whl files we are
    going to be able to get from PyPI that are compatible with lambda.
  3. Any sdists left over without a corresponding compatible whl file will
    now be built to a whl. If this results in an incompatible whl file there
    is nothing more we can do to get a compatible one and it is up to the
    user to vendor that package themselves.

Now that we have done our best to get a whl file for each dependency, we
do a final pass over the directory and install all compatible whl files
to the lambda package. Any left over dependencies that do not have a
compatible whl file are reported to the user, as they will need to be
manually added to the vendor directory before deployment.
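The compatibility decision above hinges on a wheel's platform tag. As a rough illustration (not the PR's actual code), a check like the following captures the idea; the platform-tag set mirrors the `_MANYLINUX_COMPATIBLE_PLAT` constant that appears later in this diff:

```python
# Platform tags considered safe for the Lambda environment, per the
# _MANYLINUX_COMPATIBLE_PLAT constant shown in the diff below.
COMPATIBLE_PLATFORMS = {'any', 'linux_x86_64', 'manylinux1_x86_64'}


def is_lambda_compatible(wheel_filename):
    # Wheel filenames follow PEP 427:
    # {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    # so the last dash-separated component is the platform tag.
    parts = wheel_filename[:-len('.whl')].split('-')
    platform_tag = parts[-1]
    return platform_tag in COMPATIBLE_PLATFORMS


print(is_lambda_compatible('foo-1.2-cp36-cp36m-manylinux1_x86_64.whl'))   # True
print(is_lambda_compatible('bar-0.3-cp36-cp36m-macosx_10_12_x86_64.whl'))  # False
```

Pure-python wheels carry the `any` platform tag, which is why step 1 alone already covers most simple packages.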

Virtualenv is no longer used so the compat functionality for it is no
longer needed.

Since the packaging process now downloads whl files, which vary by Python
version, the deployment process now rebuilds the deployment package if it
detects that the Python version has changed.

Old PR which stopped syncing with my fork branch for some reason: #382

closes: #249 #227 #364

cc @jamesls @kyleknap @dstufft

@stealthycoin stealthycoin force-pushed the whl-packaging branch 2 times, most recently from 73add2c to 12ce5f3 Compare June 26, 2017 18:16
@codecov-io

codecov-io commented Jun 26, 2017

Codecov Report

Merging #388 into master will increase coverage by 0.79%.
The diff coverage is 99.68%.


@@            Coverage Diff             @@
##           master     #388      +/-   ##
==========================================
+ Coverage   92.75%   93.55%   +0.79%     
==========================================
  Files          18       18              
  Lines        2569     2807     +238     
  Branches      334      369      +35     
==========================================
+ Hits         2383     2626     +243     
+ Misses        137      133       -4     
+ Partials       49       48       -1
Impacted Files Coverage Δ
chalice/package.py 97.59% <ø> (ø) ⬆️
chalice/deploy/deployer.py 90.37% <100%> (ø) ⬆️
chalice/constants.py 100% <100%> (ø) ⬆️
chalice/compat.py 100% <100%> (+28%) ⬆️
chalice/utils.py 95.23% <100%> (+3.71%) ⬆️
chalice/deploy/packager.py 95.98% <99.62%> (+5.15%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2fa94a1...4822615.

class DependencyBuilder(object):
"""Build site-packages by manually downloading and unpacking whls.

Pip is used to download all the dependency sdists. Then wheels that
Contributor Author

that compatible -> that are compatible

Contributor

@kyleknap kyleknap left a comment

Looks good. Just had some comments/questions. I really only looked through the changes posted in the last two commits from your closed PR since I had looked at the other ones before.

# type: (Config, str, str) -> bool
if not self._osutils.file_exists(deployment_package_filename):
return False
lambda_config = self._aws_client.get_function_configuration(
Contributor

I wonder if we can avoid making this extra GetFunctionConfiguration call? It looks like early there is similar logic: https://github.com/awslabs/chalice/blob/master/chalice/deploy/deployer.py#L499 that we could possibly reuse to save us the extra API call?

Contributor Author

It looked like it would require too much refactoring of things not related to my PR, so I didn't, but yes, that call is made twice, which is inefficient.

Member

I'd prefer not to make the deployment slower if there's a reasonable workaround here. If you can't do Kyle's suggestion, can we not just add the python version to the deployment package filename? That should avoid a get_function_configuration call with the small cost of being unable to detect out of band (from chalice deploy) changes, which I'm sure will already break the deploy process.
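The workaround suggested here can be sketched roughly as follows (hypothetical function and parameter names, not the PR's code): bake the interpreter version into the deployment package filename, so switching Python versions is a cache miss that forces a rebuild instead of an extra API call.

```python
import sys


def deployment_package_filename(requirements_hash):
    # Hypothetical sketch: embedding the local interpreter version in the
    # cached package name means a python version switch never reuses a
    # stale package, with no GetFunctionConfiguration call needed.
    python_version = 'python%d.%d' % sys.version_info[:2]
    return '%s-%s.zip' % (requirements_hash, python_version)
```

The trade-off, as noted above, is that out-of-band changes to the deployed function's runtime would no longer be detected.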

def _install_purelib_and_platlib(self, whl, root):
# type: (Package, str) -> None
# Take a wheel package and the directory it was just unpacked into and
# properly unpackage the purelib and platlib subdirectories if they
Contributor

Might be good to explain what "properly" unpackage entails either by expanding the comment or by breaking this function into small internal methods to help explain what each step is?

Contributor Author

I'll reword the comment.

Member

Will also be nice to explain the what and why.

return name, version


class PipWrapper(object):
Contributor

What was the reasoning to have this abstract class when it is subclassed only once?

Contributor Author

I subclassed it for testing purposes as well and wanted to ensure that the type signature contract is not broken by the testing class. This way mypy will catch it if either of them gets out of sync. I could also move the logic from SubprocessPip up to PipWrapper and just have the testing class inherit from PipWrapper and override the main method. I thought it was clearer what the purpose of everything was this way.

Contributor Author

Looks like the typing doesn't actually help the way I thought it did, so I'm not sure exactly how to accomplish what I was trying for here.

"""Execute a pip command with the given arguments."""
main_args = [command] + args
_, err = self._wrapped_pip.main(main_args)
if err:
Contributor

This seems like it could be potentially flaky in terms of checking standard error to determine if there was an error. Is there any reason why we don't just use check_call and provide subprocess.PIPE for standard out/error, or just check the rc of the subprocess? That seems like it would be a bit more reliable.

Contributor Author

Yes, it will fail if it fails to get packages, which is not uncommon for smaller simple pure python packages. That is to be expected and not an error. All I really care about is if pip throws us a stack trace of some kind via stderr, and the rc does not vary for that, so I can't sense what kind of error it was without parsing stderr.

I talked to @dstufft about this and if I remember correctly this was what we arrived at as the best solution for catching these types of issues.

Contributor Author

Offline discussion: Need to talk with Donald about this more in detail. But the initial download case can probably rely on rc; the rest will need to tolerate an rc != 1, so we are stuck parsing the stderr.
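The stderr-based check being discussed can be sketched like this (a hypothetical helper, not the PR's actual code): tolerate nonzero return codes from expected download failures, but treat a traceback on stderr as a genuine crash.

```python
def pip_failed_with_traceback(rc, stderr):
    # Hypothetical sketch: pip exits nonzero both when no compatible wheel
    # exists (expected and tolerated here) and when it actually crashes, so
    # the return code alone cannot distinguish the two cases. A Python
    # traceback on stderr, however, only appears for real failures.
    if rc == 0:
        return False
    return b'Traceback (most recent call last):' in stderr
```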

# type: (str) -> Iterator[Tuple[str, List[str], List[str]]]
return os.walk(path)

def copytree(self, source, destination):
Contributor

Why can't we just do:

shutil.copytree(source, destination)

It seems that it will handle individual files and directories for you.

Contributor Author

@stealthycoin stealthycoin Jun 26, 2017

https://github.com/python/cpython/blob/master/Lib/shutil.py#L315 which fails if the directory already exists, which in our case it always will. copy/copy2 clobber files and take the directory itself so they are no good either.
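The custom copy helper being defended here amounts to a "merging" copytree, roughly like this sketch (hypothetical name, not the PR's code):

```python
import os
import shutil


def merge_copytree(source, destination):
    # Unlike shutil.copytree, tolerate an existing destination directory:
    # walk the source tree, create subdirectories as needed, and copy files
    # into place one at a time.
    if not os.path.exists(destination):
        os.makedirs(destination)
    for name in os.listdir(source):
        src = os.path.join(source, name)
        dst = os.path.join(destination, name)
        if os.path.isdir(src):
            merge_copytree(src, dst)
        else:
            shutil.copy2(src, dst)
```

(As an aside for modern readers: Python 3.8 later added `shutil.copytree(..., dirs_exist_ok=True)`, but that option did not exist at the time of this PR.)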

Member

Yeah copytree is pretty annoying like that.

packages=[
'foo-1.2-cp36-cp36m-manylinux1_x86_64.whl'
],
whl_contents=['{data_dir}/purelib/foo/placeholder']
Contributor

The {data_dir} part is a little confusing for me. What does that represent?

Contributor Author

The data directory in an unpacked wheel file.

Contributor Author

Offline discussion: Those were there for the implicit case where I do not supply wheel contents and I need to calculate the data directory name anyway, so I reused that mechanism. It's clearer to write it out in the explicit tests, so I will change it to do that rather than rely on the format string.
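For context on what `{data_dir}` expands to: per PEP 427, a wheel's data directory is named `{distribution}-{version}.data`, derived from the first two components of the wheel filename. A minimal sketch (illustrative, not the PR's code):

```python
def data_dir_name(wheel_filename):
    # Per PEP 427, a wheel's optional data directory is named
    # {distribution}-{version}.data; its purelib/ and platlib/
    # subdirectories hold files destined for site-packages.
    name, version = wheel_filename.split('-')[:2]
    return '%s-%s.data' % (name, version)
```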

packages=[
'foo-1.2-cp36-cp36m-manylinux1_x86_64.whl'
],
whl_contents=['{data_dir}/platlib/foo/placeholder']
Contributor

Also I am not very sure what the placeholder part of the string for whl_contents represents. Could you elaborate on what the placeholder is for?

Contributor Author

I just need a garbage file in the directory so that the folder exists. Maybe I'm conflating how zip works with how git works. I didn't try it without the placeholders.

Contributor Author

Offline discussion: removed the placeholder, I didn't realize empty folders worked in zip/tar for some reason.

Member

@jamesls jamesls left a comment

I've been trying this out locally and it's worked really well so far. I had some comments about the APIs and how the code is structured. Haven't had a chance to go through all the tests yet but wanted to get some early feedback first.


chalice/utils.py Outdated
os.makedirs(destination)
names = os.listdir(source)
for name in names:
new_source = os.path.join(source, name)
Member

I think we should try to use the osutils methods where possible. This makes it easier to write alternate implementations without having to implement all the methods. So self.joinpath here instead of os.path.join.

chalice/utils.py Outdated

@contextlib.contextmanager
def tempdir(self):
# type: () -> Any
Member

Seems like we can restrict the type further than Any.

setup.py Outdated
'typing==3.5.3.0',
'six>=1.10.0,<2.0.0',
'pip>=9'
Member

We should at least bind to a major version, so 9.x only.


This is a bit tricky to do with pip I think, because while we're not setuptools style, we do rev our versions fairly regularly, but our breaking changes tend to be removing things that few people use rather than completely changing the common paths. So while we could do this, we should expect to need to rev this once or twice a year.

Member

Once or twice a year seems reasonable, I'm just concerned that it's still a possibility within semver to break a common API (even if it's rare), and people on older versions of chalice won't be guarded against this.
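The compromise being discussed would look something like this in setup.py (illustrative fragment; the surrounding pins are from the diff above, the pip upper bound is the suggested change):

```python
# Illustrative setup.py fragment: bind pip to the 9.x series so a future
# major release with breaking API changes is not picked up silently by
# older chalice installs.
install_requires = [
    'typing==3.5.3.0',
    'six>=1.10.0,<2.0.0',
    'pip>=9,<10',
]
```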

virtualenv.create_environment(venv_dir)
def __init__(self, dependency_builder=None):
# type: (Optional[DependencyBuilder]) -> None
self._osutils = OSUtils()
Member

The common pattern we use is that either the object that requires the dependency is responsible for defaulting it, or the very top level code (factories/entry points etc) constructs the dependency. So I'd suggest moving the OSUtils() construction all the way up to the factory function, or down to DependencyBuilder (not my preference).

# Sort the downloaded packages into three categories:
# - sdists (Pip could not get a wheel so it gave us a sdist)
# - valid whls (lambda compatible wheel files)
# - invalid whls (lambda incompatible wheel files)
Member

Why not just call it that (lambda_incompatible_wheels) rather than have a comment about what an "invalid whl" is?

Member

Or just compatible / incompatible if you want to keep it shorter. I think there's enough context for that to make sense.

valid_whls, invalid_whls = self._categorize_whl_files(directory)
sdists = deps - valid_whls - invalid_whls

# Find which packages we do not yet have a valid whl file for. And
Member

I think I've made this comment before, but rather than explaining what the code is doing, it would really helpful if you explain why you're doing this. As someone not as familiar with the intricacies of pip, a high level overview of why we're doing this multipass approach, why an invalid wheel might be able to be "converted" to a valid manylinux1 wheel, etc. Putting a more detailed explanation of your algorithm outlined in the PR description would be better in the code itself IMO.
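The multipass bookkeeping referenced here boils down to the set arithmetic visible in the diff context above. A minimal sketch of that step (hypothetical function name; the sets would contain Package objects in the real code):

```python
def plan_remaining_work(all_deps, compatible, incompatible):
    # Mirrors the set arithmetic in the diff: anything that is neither a
    # compatible nor an incompatible wheel must have come down as an sdist,
    # and everything without a compatible wheel still needs a download or
    # build attempt on the next pass.
    sdists = all_deps - compatible - incompatible
    missing = all_deps - compatible
    return sdists, missing
```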

# type: (str, str, Set[Package]) -> None
if os.path.isdir(dst_dir):
shutil.rmtree(dst_dir)
os.makedirs(dst_dir)
Member

If you're introducing OSUtils as a dependency for this object, the point of that is it abstracts all os operations to allow for granular testing, error conditions, etc. Using shutil and os directly in the same object defeats that purpose.

class PipWrapper(object):
def main(self, args):
# type: (List[str]) -> Tuple[Optional[bytes], Optional[bytes]]
raise NotImplementedError('PipWrapper.main')
Member

I think we usually just use the method name and not the base class. If a subclass doesn't implement this they'll see the base class's name in the error message which can be confusing.

arguments = ['-r', requirements_file, '--dest', directory]
self._execute('download', arguments)

def download_manylinux_whls(self, packages, directory):
Member

Nitpick, but it's not 100% consistent in this code whether we use "wheel" or "whl" (build_wheel vs. download_manylinux_whls). FWIW, I prefer "wheel": it's two extra characters, makes it more readable, and is the name of the concept, whereas "whl" is the file extension.

@JordonPhillips JordonPhillips self-requested a review June 29, 2017 22:41
Member

@JordonPhillips JordonPhillips left a comment

Much better descriptions. Had a pass at the code itself but not the tests yet.


Your deployment will continue but may not work correctly
if missing dependencies are not present. For more information:
http://chalice.readthedocs.io/en/latest/topics/packaging.html
Member

You should probably update this to include information about this feature.

Contributor Author

Got a draft up.

dependency_builder = DependencyBuilder(self._osutils)
self._dependency_builder = dependency_builder

def _get_requirements_file(self, project_dir):
Member

+1

into a site-packages directory, to be included in the bundle by the
packager.
"""
_MANYLINUX_COMPATIBLE_PLAT = {'any', 'linux_x86_64', 'manylinux1_x86_64'}
Member

+1

# which will serve as the master list of dependencies needed to deploy
# successfully.
self._pip.download_all_dependencies(requirements_file, directory)
deps = {Package(directory, filename) for filename
Member

Finally a project where we can use dict comprehensions


p = subprocess.Popen(cmd, cwd=package_dir,
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p.communicate()
self._osutils.joinpath(egg_info_dir)
Member

This isn't being used, and only has one argument anyway which I'm not sure makes sense.

p.communicate()
self._osutils.joinpath(egg_info_dir)
info_contents = self._osutils.get_directory_contents(egg_info_dir)
assert len(info_contents) == 1
Member

Again, I'm not sure the user experience will be great if this condition fails.


class PipRunner(object):
"""Wrapper around pip calls used by chalice."""

Member

Sorry to nitpick, but this is the only place you've got a space between the class doc string and the first method.

with tarfile.open(tarfile_path, 'r:gz') as tar:
tar.extractall(unpack_dir)

def directory_exists(self, path):
Member

These should probably stay close to the names of the functions they're aliasing.


Member

@jamesls jamesls left a comment

Just had some minor feedback. I think this latest version is looking good. Only major piece missing is updated packaging docs.

chalice/utils.py Outdated
@@ -92,6 +98,15 @@ def get_file_contents(self, filename, binary=True):
with open(filename, mode) as f:
return f.read()

@contextlib.contextmanager
def get_file_context(self, filename, mode='r'):
Member

Why is this needed? Is there a reason you can't use OSUtils.open directly?

chalice/utils.py Outdated
@@ -101,6 +116,84 @@ def set_file_contents(self, filename, contents, binary=True):
with open(filename, mode) as f:
f.write(contents)

def unpack_zipfile(self, zipfile_path, unpack_dir):
Member

This is picky, but I think we should use verbs that are commonly associated with whatever the noun is, so for a zipfile, I would expect "unzip", or "extract" but not "unpack".

chalice/utils.py Outdated
z.extractall(unpack_dir)

@contextlib.contextmanager
def zipfile_context(self, filename, mode='r',
Member

Why do we need to wrap the zipfile with its own context manager? This is already a capability of zipfile.ZipFile so it seems unnecessary.

chalice/utils.py Outdated
finally:
z.close()

def unpack_tarfile(self, tarfile_path, unpack_dir):
Member

Same comment as zip. Maybe "untar" or "extract"


raise InvalidSourceDistributionNameError(sdist_path)
# There should only be one directory unpacked.
contents = self._osutils.get_directory_contents(unpack_dir)
assert len(contents) == 1
Member

I think I may have either added this or a similar assertion somewhere in the packager code. I would also suggest removing these assertions. The only reason I added them previously was during the initial POC and they just never got removed. They generate unhelpful error messages. If this is a case a customer can run into, raise a specific exception, otherwise remove the assertion.
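The suggested replacement for the bare assertion might look like this sketch (the exception name is hypothetical, not the PR's code):

```python
class UnexpectedSdistContentsError(Exception):
    # Hypothetical exception: raised instead of a bare assert so the user
    # sees an actionable message rather than an opaque AssertionError.
    pass


def single_unpacked_directory(contents, sdist_path):
    # There should be exactly one top-level directory in an unpacked sdist;
    # anything else means the sdist layout is not one we can handle.
    if len(contents) != 1:
        raise UnexpectedSdistContentsError(
            'Expected exactly one directory in unpacked sdist %s, found %d'
            % (sdist_path, len(contents)))
    return contents[0]
```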

try:
requirements_filepath = self._get_requirements_filename(
project_dir)
site_packages_dir = self._osutils.joinpath(
Member

Does this directory not need to be python version aware? How does it handle the case where the python version is switched and different requirements are specified? Wouldn't it be safer to just have a separate py-specific site-packages directory to avoid any possibility of invalid packages?

Contributor Author

Site packages are not reused. I was considering deleting it; the only reason it's not deleted was for debugging purposes, so I could see what the folder looked like. It's convenient, if something goes wrong, to be able to see what was packaged before and after vendoring.

Member

Ah, I missed that that happens in _install_wheels. While it seems a little surprising that a method like build_site_packages will wipe out the target_directory if it exists (arguably even dangerous), seems like my original concern isn't an issue.

Contributor Author

I suppose there are three options:

  • Delete it first and rebuild, which is safe with regards to building a working site-packages directory. Not safe if contents were pre-merged in rather than post-merged in like we do currently.
  • Merge content. Not safe since python versions can change or get removed.
  • Error out.

So either:

  • Leave the code as is.
  • I leave the directory there after it's built, but delete it in the caller of build_site_packages and have it error out if the directory exists.
  • Delete it after the package finishes building, and have build_site_packages error out if the directory already exists.

My preference is option 1 or 2 since I like having the directory there "just in case" you need to investigate it for some reason.

Member

I'm ok with option 2. That allows build_site_packages to just focus on populating the directory and it's the caller's responsibility to put it into whatever state it needs to be (empty in this case).

Though my personal preference would be to just delete the directory. It's of no use to chalice users (we don't reuse the directory across deploys) and it just takes up space. If we want dev debugging hooks, we can always add them later and explicitly turn them on (or ask users to run with debugging mode enabled).

Contributor Author

I can just nuke the directory and we can allow debugging options later. The debug hooks would need to give you access to it rather than just logging it, though, since sometimes you need to go look in files and go pretty deep into nested directories, more than a simple log will give.

Member

Ok that seems fine. Though in that case you can just use a OSUtils.tempdir so it can handle cleaning up the directory when you're done.
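The OSUtils.tempdir approach suggested here is essentially a context manager around `tempfile.mkdtemp` (a sketch, not the PR's code; it also shows a return type narrower than `Any`, as requested in an earlier review comment):

```python
import contextlib
import shutil
import tempfile


@contextlib.contextmanager
def tempdir():
    # type: () -> Iterator[str]
    # Create a scratch directory and guarantee cleanup even on error, so
    # intermediate site-packages never lingers on disk after a deploy.
    dirname = tempfile.mkdtemp()
    try:
        yield dirname
    finally:
        shutil.rmtree(dirname)
```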

Contributor Author

@stealthycoin stealthycoin Jul 4, 2017

Yep, that's the plan. The debug hook can copy it out somewhere before it gets erased.

@stealthycoin stealthycoin force-pushed the whl-packaging branch 2 times, most recently from 9c217f1 to d89826c Compare July 3, 2017 23:25
Member

@jamesls jamesls left a comment

Docs looks good, just had a few suggestions.

run ``pip install -r requirements.txt`` in a virtual environment
and automatically install 3rd party python packages into the deployment
package.
install any packages it find or build compatible wheels for. Specifically
Member

I think the wording is off here. Did you mean "...install any packages it finds or builds compatible wheels for" maybe?

and automatically install 3rd party python packages into the deployment
package.
install any packages it find or build compatible wheels for. Specifically
all pure python projects as well as all projects that upload wheel files for
Member

What about packages instead of projects?

Contributor

@kyleknap kyleknap left a comment

It looks good. But one general comment that I have is that it would be nice if the literal commands that need to be run were present (sort of like what the psycopg2 example had). It is pretty convenient to be able to blindly copy and paste the commands I need to run instead of reading the entire document and trying to interpret the command that I need to run.

`cryptography <https://pypi.python.org/pypi/cryptography>`__ package in a
Chalice app for the ``python3.6`` lambda environment.

We're going to leverage the ``vendor/`` directory in order to use this package
Contributor

It might be worth mentioning that since chalice will try to build the wheel for you, if you are on the right system it may be able to work via the requirements.txt. I am only suggesting it because it came off to me as though cryptography will require additional steps to be part of the deployment package.

Member

What about just saying something like "Suppose we're on a Mac and we want to deploy an app that uses cryptography". That keeps it more user-centric and less about implementation details of chalice building wheels. You could also say that this wouldn't be an issue if we were on the same platform that Lambda uses.

Contributor

That would be fine as well. I just want to make sure a user sees this and does not immediately start trying to follow the steps when it actually would have worked for them to start.

Member

@JordonPhillips JordonPhillips left a comment

Code looks good, docs look good, just a pair of very minor comments on some of the tests.

def _build_fake_sdist(self, filepath):
# tar.gz is the same no reason to test it here as it is tested in
# unit.deploy.TestSdistMetadataFetcher
assert filepath.endswith('zip')
Member

.zip

Contributor Author

Ah, thought I caught all those.

installed_packages = os.listdir(site_packages)

pip.validate()
for req in reqs:
Member

There's only one of these so it's a bit odd to iterate. This happens in a few other places as well.

Contributor Author

Makes it easy to update the tests if I need to.

Member

Yeah I'm -0 on keeping them. This is not a PR blocker.

Member

@jamesls jamesls left a comment

Looks good to me, aside from pending feedback from Kyle/Jordon.

Contributor

@kyleknap kyleknap left a comment

Looks good! 🚢

jcarlyl added 6 commits July 5, 2017 13:38
The algorithm for building a lambda compatible bundle is as follows:

1) Download all packages using pip download without specifying binary
preference. This allows it to download whl files where possible and
sdists otherwise.
2) Try to download lambda-compatible whl files for all sdists
and lambda-incompatible whl files that were downloaded in the first
step. Once this step is completed, we have all the whl files we are
going to be able to get from PyPI that are compatible with lambda.
3) Any sdists left over without a corresponding compatible whl file will
now be built into a whl. If this results in an incompatible whl file,
there is nothing more we can do to get a compatible one, and it is up to
the user to vendor that package themselves.
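For step 2, the cross-platform fetch boils down to a `pip download` call with explicit platform tags. The helper below is only an illustrative sketch of how those arguments might be assembled for the `python3.6` Lambda runtime; the function name and hard-coded tag values are assumptions, not the actual chalice implementation:

```python
# Illustrative sketch: build the pip argument list used to fetch a
# Lambda-compatible wheel for one package. The cp36/cp36m/manylinux1
# tags match the python3.6 Lambda runtime; names here are hypothetical.
def lambda_download_args(package, dest):
    return [
        'download',
        '--only-binary=:all:',           # refuse sdists; wheels only
        '--platform', 'manylinux1_x86_64',
        '--implementation', 'cp',
        '--abi', 'cp36m',
        '--python-version', '36',
        '--dest', dest,
        package,
    ]
```

These flags are pip's documented cross-platform download options; if the download fails, no compatible wheel exists on PyPI for that package and the fallback build in step 3 kicks in.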

Now that we have done our best to get a whl file for each dependency, we
do a final pass over the directory and install all compatible whl files
to the lambda package. Any leftover dependencies that do not have a
compatible whl file are reported to the user, as they will need to be
manually added to the vendor directory before deployment.
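The compatibility check in that final pass amounts to inspecting each wheel's PEP 425 filename tags. A minimal sketch, assuming the python3.6 Lambda runtime (the function name is hypothetical, and compressed tag sets like `cp36.cp37` are ignored for brevity):

```python
def is_lambda_compatible(wheel_filename):
    # Wheel names follow PEP 425:
    # {dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    python_tag, abi_tag, platform_tag = wheel_filename[:-4].split('-')[-3:]
    if platform_tag == 'any' and python_tag.startswith('py'):
        return True  # pure-python wheel runs anywhere
    # Binary wheels must match the Lambda runtime interpreter and platform.
    return (python_tag == 'cp36' and abi_tag == 'cp36m'
            and platform_tag.startswith('manylinux1'))
```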

Virtualenv is no longer used so the compat functionality for it is no
longer needed.

Since the packaging process now downloads whl files, which vary by
python version, the deployment process rebuilds the deployment package
if it detects that the python version has changed.
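That version-change check could look roughly like this; the recording mechanism and names are assumptions, not the actual chalice code:

```python
import sys

def needs_rebuild(recorded_runtime):
    # Hypothetical check: compare the runtime string recorded at the
    # last deploy (e.g. 'python3.6') against the interpreter running now.
    current = 'python%d.%d' % sys.version_info[:2]
    return recorded_runtime != current
```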
The interface for creating a site-packages directory from a
requirements file has changed to no longer make assumptions about the
chalice directory structure. The python version was appended to the end
of a package name to avoid needing to query for the last deployed
python version twice.
Updated the packaging documentation and addressed some minor feedback
items.

Successfully merging this pull request may close these issues.

Try to autoinstall manylinux1 wheels when possible