Cannot import azure-datalake-store in python <3.3 if no other azure library is installed #236

Closed
VincentBLortie opened this issue Aug 27, 2018 · 11 comments

Comments

@VincentBLortie

Description

When azure-datalake-store is the only azure package installed using pip in a python environment with version < 3.3, azure.datalake.store cannot be imported.

(My guess is that unlike other azure packages, azure-datalake-store does not depend on azure-nspkg)


Reproduction Steps

  1. Create a new environment with a version of python earlier than 3.3 and activate that environment

Example:

conda create --name p27 python=2.7
source activate p27
  2. Install azure-datalake-store and nothing else: pip install azure-datalake-store

  3. Attempt to import azure.datalake.store: python -c "import azure.datalake.store"

The output is:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named azure.datalake.store

Environment summary

SDK Version: What version of the SDK are you using? (pip show azure-datalake-store)

Name: azure-datalake-store
Version: 0.0.29
Summary: Azure Data Lake Store Filesystem Client Library for Python
Home-page: https://github.com/Azure/azure-data-lake-store-python
Author: Microsoft Corporation
Author-email: [email protected]
License: MIT License
Location: /Users/vincentlortie/miniconda2/envs/p27/lib/python2.7/site-packages
Requires: futures, pathlib2, adal, cffi
Required-by:

Python Version: What Python version are you using? Is it 64-bit or 32-bit?
2.7.15

OS Version: What OS and version are you using?
macOS High Sierra 10.13.6

Shell Type: What shell are you using? (e.g. bash, cmd.exe, Bash on Windows)
zsh

@akharit
Member

akharit commented Aug 28, 2018

The root cause is that __init__.py is not present in the azure folder of the pip package. Python 3.3 and later treats a folder without __init__.py as a namespace package, while Python 2 requires an explicit __init__.py. So this works only on Python 3.3+.
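The difference can be reproduced without any Azure package. The following is a minimal sketch, assuming a throwaway package name (`nsdemo_pkg`, hypothetical) and a temporary directory: it creates a sub-package with an `__init__.py` under a top-level folder that has none, then attempts the import.

```python
import importlib
import os
import shutil
import sys
import tempfile


def import_without_top_level_init(top="nsdemo_pkg", sub="store"):
    """Build <tmp>/<top>/<sub>/__init__.py with no <top>/__init__.py, then import <top>.<sub>."""
    root = tempfile.mkdtemp()
    os.makedirs(os.path.join(root, top, sub))
    # Only the sub-package gets an __init__.py; the top-level folder has none,
    # which mirrors the layout of the published wheel when azure-nspkg is absent.
    with open(os.path.join(root, top, sub, "__init__.py"), "w") as handle:
        handle.write("VALUE = 1\n")
    sys.path.insert(0, root)
    try:
        importlib.import_module(top + "." + sub)
        return True
    except ImportError:
        return False
    finally:
        sys.path.remove(root)
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    # Python 3.3+ resolves the top-level folder as a namespace package;
    # Python 2.7 raises ImportError because the folder has no __init__.py.
    print(import_without_top_level_init())
```

Running the script under both interpreters should show the import succeeding on Python 3.3+ and failing on Python 2.7, which matches the traceback in the report.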

Installing azure-nspkg will solve the problem, but I am not sure of any other implications of adding it as a dependency.

@lmazuel are there any other things I need to care about if I decide to make azure-nspkg a dependency?

@lmazuel
Member

lmazuel commented Aug 28, 2018

@akharit This should be the case already. We patch the wheel build at runtime to inject this dependency, and I see azure-nspkg in the configuration:
https://github.com/Azure/azure-data-lake-store-python/blob/master/setup.cfg#L3

So I'm not sure what exactly is happening here. Give me a moment to unzip your wheel and see what metadata is actually inside. Maybe your build env is too old for that patch.

@lmazuel
Member

lmazuel commented Aug 28, 2018

@akharit If I git clone the repo and build the wheel myself, it works as expected:

> python .\setup.py bdist_wheel
...
> pip install .\dist\azure_datalake_store-0.0.29-py2.py3-none-any.whl
Processing c:\users\lmazuel\git\azure-data-lake-store-python\dist\azure_datalake_store-0.0.29-py2.py3-none-any.whl
Collecting adal>=0.4.2 (from azure-datalake-store==0.0.29)
...
Successfully installed PyJWT-1.6.4 adal-1.0.2 asn1crypto-0.24.0 azure-datalake-store-0.0.29 azure-nspkg-2.0.0 certifi-2018.8.24 cffi-1.11.5 chardet-3.0.4 cryptography-2.3.1 idna-2.7 pycparser-2.18 python-dateutil-2.7.3 requests-2.19.1 six-1.11.0 urllib3-1.23

So my guess is that your build system is too old. Could you tell me how you build the wheels you upload on PyPI? Python version? wheel/setuptools/pip version?
I don't know exactly how old is too old, since I didn't try downgrading wheel/setuptools/pip until it broke. Ideally we would do that and add a condition in setup.py to fail if wheel/setuptools/pip is too old, but I can't take the time to do that right now. The easy fix is to make sure you use the latest of everything. This is what was installed on my machine when I tested:

  • Python 3.7
  • setuptools 39.0.1
  • pip 10.0.1
  • wheel 0.30.0
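A version guard of the kind suggested above could look roughly like this. It is only a sketch: the thread never establishes which versions are actually too old, so the floors below are an assumption that simply reuses the versions listed as working, and `parse_version` and `find_outdated` are hypothetical helper names. The version parsing is deliberately naive (numeric parts only).

```python
import re

# Assumption: these floors are just the versions reported as working in this
# thread, not verified minimums.
ASSUMED_MINIMUMS = {"setuptools": "39.0.1", "pip": "10.0.1", "wheel": "0.30.0"}


def parse_version(text):
    """Turn '39.0.1' into (39, 0, 1), keeping at most three numeric parts."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def find_outdated(installed, minimums=ASSUMED_MINIMUMS):
    """Return (name, installed_version, minimum) for each missing or too-old tool."""
    outdated = []
    for name, floor in sorted(minimums.items()):
        found = installed.get(name)
        if found is None or parse_version(found) < parse_version(floor):
            outdated.append((name, found, floor))
    return outdated


if __name__ == "__main__":
    # Versions taken from a build log; a real setup.py would query them instead.
    problems = find_outdated({"setuptools": "40.2.0", "pip": "18.0", "wheel": "0.31.1"})
    if problems:
        raise SystemExit("Build tools too old: {}".format(problems))
```

A setup.py could call `find_outdated` with the versions it detects and abort before building, so a stale environment fails loudly instead of producing a wheel with missing metadata.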

@akharit
Member

akharit commented Aug 28, 2018

@lmazuel
The wheel is built using Travis (https://travis-ci.org/Azure/azure-data-lake-store-python).

Based on the last build logs (https://travis-ci.org/Azure/azure-data-lake-store-python/jobs/419414178)

  • python 3.6.3
  • pip 18.0
  • setuptools 40.2.0
  • wheel 0.31.1

So the versions seem to be up to date, which is quite strange. When I look at the Travis log for the last build that deployed, I can see that it explicitly copies __init__.py to the root folder, but I do not see a corresponding "Adding" entry for that file around line 1012, where it builds the final package for upload. I do see it executing

    for azure_sub_package in folder_with_init:
        init_file = os.path.join(bdist_dir, azure_sub_package, '__init__.py')
        if os.path.isfile(init_file):
            logger.info("manually remove {} while building the wheel".format(init_file))
            os.remove(init_file)

which removes the __init__.py file.
https://github.com/Azure/azure-data-lake-store-python/blob/master/azure_bdist_wheel.py#L45

Any ideas?

@VincentBLortie
Author

@akharit If it helps, my understanding is that __init__.py is removed at that line in azure_bdist_wheel.py because it is meant to be provided by azure-nspkg, which should be added as a requirement as a result of lines 35-36:

self.distribution.install_requires.append(
    "{}>=2.0.0".format(self.azure_namespace_package))

@VincentBLortie
Author

VincentBLortie commented Aug 28, 2018

I ran python setup.py bdist_wheel -d ../wheel locally using a conda environment identical to what you say travis is using, except for the python version (I used 3.6.6 rather than 3.6.3).

The diff between my whl's METADATA and what is on PyPI is telling:

$ diff mine_METADATA theirs_METADATA
23d22
< Requires-Dist: azure-nspkg (>=2.0.0)

Edit: That is also the only difference between the two whl archives.
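The same check can be scripted instead of unzipping and diffing by hand. Below is a minimal sketch, assuming the wheel contains exactly one `*.dist-info/METADATA` file; `wheel_requirements` is a hypothetical helper name. It reads that file and returns its `Requires-Dist` entries, so a missing `azure-nspkg` line is easy to spot.

```python
import zipfile
from email.parser import Parser


def wheel_requirements(wheel):
    """Return the Requires-Dist entries from a wheel (a path or a binary file object)."""
    with zipfile.ZipFile(wheel) as archive:
        names = [n for n in archive.namelist() if n.endswith(".dist-info/METADATA")]
        if len(names) != 1:
            raise ValueError("expected one METADATA file, found {}".format(len(names)))
        text = archive.read(names[0]).decode("utf-8")
    # METADATA uses email-style headers, so the standard email parser can read it.
    headers = Parser().parsestr(text, headersonly=True)
    return headers.get_all("Requires-Dist") or []


if __name__ == "__main__":
    import sys

    # Usage: python check_wheel.py path/to/some.whl
    for requirement in wheel_requirements(sys.argv[1]):
        print(requirement)
```

Run against a locally built wheel and the one downloaded from PyPI, the two outputs should differ by the `azure-nspkg (>=2.0.0)` line shown in the diff.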

@akharit
Member

akharit commented Aug 28, 2018

@VincentBLortie Thanks. I think I have figured out why it is different. It seems that the order of the values in the distributions parameter in .travis.yml is important.

.travis.yml: line 39 distributions: "sdist bdist_wheel"

Running python setup.py sdist bdist_wheel, i.e. the order currently used for the PyPI deployment, yields a wheel file without the azure-nspkg dependency, while running python setup.py bdist_wheel sdist seems to work fine on my system.

Why this happens, I have no idea. As far as I understand, bdist_wheel and sdist are separate operations and there should not be any interaction between them, so I am not sure why the order matters here.

Anyway let me confirm on some other environments before I push a fix. If you can, do you mind running them both to confirm this behavior?

@VincentBLortie
Author

Yup, I get the same result:

diff bdist_wheel_first/azure_datalake_store-0.0.29.dist-info/METADATA sdist_first/azure_datalake_store-0.0.29.dist-info/METADATA
23d22
< Requires-Dist: azure-nspkg (>=2.0.0)

@akharit akharit mentioned this issue Aug 28, 2018
@akharit
Member

akharit commented Aug 28, 2018

@VincentBLortie I have created a PR with the 'fix'.

I'll wait for a code review, since I don't really know the real reason why this is occurring. I think I'll be able to push a new release by tomorrow morning or tonight, if there is no other regression.

Thanks for raising the issue and also for the other code you worked on using adls!

@VincentBLortie
Author

Awesome, thanks a lot! If you find out why the order matters, I'd love to know.

@lmazuel
Member

lmazuel commented Aug 28, 2018

Setuptools is based on build steps, which can technically be reused by later builds to save some processing time (like Maven for Java or a Makefile for C).
For this to work, all the plugins you use must treat each build step as a black box. Unfortunately this is not always the case, and there are open issues about this, with various consequences that are difficult to predict. Example:
pypa/setuptools#1064

The best approach is to build the sdist and the wheel with a full clean in between, to be sure setuptools does not try to reuse something corrupted by a previous plugin. In practical terms, building the wheel before the sdist is usually enough, since sdist will invalidate the wheel build steps and rebuild from scratch (roughly).
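The "full clean in between" approach could be scripted along these lines. This is a sketch under assumptions, not the project's actual release process: `build_plan` and `run_plan` are hypothetical helpers, and `setup.py clean --all` only removes the build directories (it leaves things like `*.egg-info` in place), so it may not cover everything a custom plugin touches.

```python
import subprocess
import sys


def build_plan(python=sys.executable):
    """Return the commands to run, in order: wheel first, sdist last, clean before each."""
    clean = [python, "setup.py", "clean", "--all"]
    return [
        clean,
        [python, "setup.py", "bdist_wheel"],
        clean,  # drop cached build steps so sdist starts from scratch
        [python, "setup.py", "sdist"],
    ]


def run_plan(plan, cwd="."):
    """Run each command in the project directory, stopping at the first failure."""
    for command in plan:
        subprocess.check_call(command, cwd=cwd)


if __name__ == "__main__":
    run_plan(build_plan())
```

Keeping the plan separate from the runner makes the ordering easy to inspect or assert on in CI before anything is built or uploaded.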

akharit added a commit that referenced this issue Aug 29, 2018
* Reversing the parameter order

* Updated version number and history
@akharit akharit closed this as completed Aug 29, 2018