Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try dask.array #71

Merged
merged 7 commits into from
Nov 20, 2021
Merged

try dask.array #71

merged 7 commits into from
Nov 20, 2021

Conversation

aaronspring
Copy link
Contributor

@aaronspring aaronspring commented Nov 18, 2021

setup.py Outdated Show resolved Hide resolved
Co-authored-by: Ryan Abernathey <[email protected]>
@codecov
Copy link

codecov bot commented Nov 18, 2021

Codecov Report

Merging #71 (75f9184) into master (e01291b) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #71   +/-   ##
=======================================
  Coverage   97.18%   97.18%           
=======================================
  Files           2        2           
  Lines         249      249           
  Branches       71       71           
=======================================
  Hits          242      242           
  Misses          5        5           
  Partials        2        2           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e01291b...75f9184. Read the comment docs.

Copy link
Contributor

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aaronspring! FWIW I'm not sure this will make an large impact on what dependencies are installed. Dask's required dependencies are listed here

https://github.com/dask/dask/blob/c19a8300d514a8ee488e0da5e81c01155d55fdbe/setup.py#L32-L39

which are always installed, regardless if one pip installs dask, dask[array], dask[array, dataframe], dask[complete], etc.

Adding the [array] extra to the dask install will cause numpy to also be installed

https://github.com/dask/dask/blob/c19a8300d514a8ee488e0da5e81c01155d55fdbe/setup.py#L13

This is probably a good thing as it will require xhistogram to install a numpy version which is compatible with dask.array, but will, as of today, not result in a big difference since numpy is listed as a separate dependency already.

That's all to say that this change seems totally reasonable, but isn't the cause for libraries like bokeh being installed as referenced over in #70

@aaronspring
Copy link
Contributor Author

thanks @jrbourbeau for the quick description

ok. so there is no way around install_requires and the other extras (bokeh in diagnostics) were not even installed before.

so this PR here would only ensure that a matching numpy version is used @rabernat but xarray already kind of does that by itself. should this PR still go forward?

@@ -4,7 +4,7 @@ channels:
dependencies:
- python=3.7
- xarray
- dask
- dask[array]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see what the issue might be now. dask[array] (and my comment here #71 (review)) is just a pip thing and isn't supported when installing dask with conda. conda install dask will pull in other packages like bokeh though. Specifically the packages listed here.

If you just want to conda install the minimum required dependencies for Dask (which would be equivalent to pip install dask), you'll want to install the dask-core package from conda-forge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaronspring
Copy link
Contributor Author

aaronspring commented Nov 19, 2021

thanks for the explanation. the originating issue was xarray-contrib/xskillscore#359, then we discovered that xhistogram relies on dask.

also xhistogram on conda-forge relies on dask https://github.com/conda-forge/xhistogram-feedstock/blob/master/recipe/meta.yaml - not sure whether there also specifying dask.array works, but you said dask[array] is a pip thing, so likely not, but we could use dask-core on the next release maybe

@aaronspring aaronspring marked this pull request as ready for review November 19, 2021 01:12
@rabernat
Copy link
Contributor

If we are happy about this I will merge. 👍 if you think we are done here.

@aaronspring
Copy link
Contributor Author

happy to merge and maybe we can ship the conda version with dask-core, see conda-forge/xhistogram-feedstock#9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

make dask.array dependency not dask to speedup pip install
3 participants