Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spacy: added Explosion AI spaCy, thinc and related tools #25791

Closed
wants to merge 8 commits into from

Conversation

aptlin
Copy link

@aptlin aptlin commented May 15, 2017

Motivation for this change

Making NLP analysis easier with Nix.

Things done
  • Tested using sandboxing
    (nix.useSandbox on NixOS,
    or option build-use-sandbox in nix.conf
    on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • Linux
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@mention-bot
Copy link

@sdll, thanks for your PR! By analyzing the history of the files in this pull request, we identified @zimbatm, @zraexy, @vcunat and @FRidh to be potential reviewers.

Copy link
Member

@FRidh FRidh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Looks good overall
  • Remove meta.platforms
  • Preferably use fetchPypi
  • Do not commit whitespace changes in python-packages.nix

meta = with stdenv.lib; {
description = "Cython memory pool for RAII-style memory management";
homepage = https://github.com/explosion/cymem;
platforms = platforms.unix;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't set platforms in Python packages. The function already takes care of that.


buildInputs = [
pytest
nose
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs pytest and nose, but uses unittest as runner? That's odd.

name = "ftfy-${version}";
version = "4.4.2";

src = fetchurl {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fetchPypi

};

propagatedBuildInputs = [
cython
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a runtime dependency, or is it?

@aptlin
Copy link
Author

aptlin commented May 15, 2017

@FRidh, thank you very much for your review!

I have made requested changes and cleaned up the code, removing whitespace trimming in python-packages.nix.

With regards to murmurhash, cython is a build dependency, as you can see here.

@FRidh
Copy link
Member

FRidh commented May 15, 2017

With regards to murmurhash, cython is a build dependency, as you can see here.

A requirements.txt file doesn't say much, because its not used when building a package with setuptools, unless its explicitly read and added to e.g. install_requires. Looking at the setup.py file it seems it is only required to run setup and perform the build.

If a package is only used during build and install time, then it should be added to buildInputs. If it is needed at runtime or when another package that depends on it is build, it should be added to propagatedBuildInputs.

@FRidh FRidh self-assigned this May 15, 2017
@aptlin
Copy link
Author

aptlin commented May 15, 2017

@FRidh, thank you for teaching me to look into the code closer!

I have changed ftfy to make Travis CI builds work for GNU/Linux and OS X, opening an issue about it in @LuminosoInsight's repo here.

I have also changed murmurhash build inputs as requested.

@aptlin aptlin changed the title spacy: added Explosion AI spaCy, thinc and related tools spacy: added @explosion AI spaCy, thinc and related tools May 15, 2017
@aptlin aptlin changed the title spacy: added @explosion AI spaCy, thinc and related tools spacy: added Explosion AI spaCy, thinc and related tools May 15, 2017
@aptlin
Copy link
Author

aptlin commented May 15, 2017

@FRidh, I moved recipes to satisfy the 'package/default.nix' format, as well as replaced with rec {....} with let ... in.

OS X Travis seems to time out, could you please tell me how I can decrease the build time? My first idea was to set doCheck = false; in some recipes exclusively for OS X, is there a way to do it?

@FRidh
Copy link
Member

FRidh commented May 15, 2017

@sdll you can ignore Travis. While it is sometimes useful, we cannot really rely on it.

@aptlin
Copy link
Author

aptlin commented May 15, 2017

@FRidh, is there anything else I need to change? Is the tag "needs: clean-up" still relevant?

@aptlin
Copy link
Author

aptlin commented May 15, 2017

@FRidh, I have updated the recipe for ftfy, changing its version to 4.4.3, the testing issue was fixed (see rspeer/python-ftfy#75)

I have also added pip as a dependency for spaCy, because its download function would not work without it, adding the request (explosion/spaCy#1063) to fix the code to account for the read-only status of the package home directory.

@aptlin
Copy link
Author

aptlin commented May 17, 2017

@FRidh,

Travis CI on OS X seems to fail again due to the same error as in #25804:

pyemd/emd.cpp:458:10: fatal error: 'vector' file not found

Similarly, do you think we should try to fix it by specifying GCC_INSTALL_HOME?
See a similar question discussed here: harvard-acc/LLVM-Tracer#7

Is there a deeper problem with how clang c++ headers are treated in the OS X build of Nix? Shall I file an issue?

@FRidh
Copy link
Member

FRidh commented Jul 28, 2017

@knedlsepp is this related to #27649? (I'm just diving through old PR's)

@knedlsepp
Copy link
Member

@FRidh: Seems highly likely. I'll try to rework my PR asap, so we can merge.

@FRidh
Copy link
Member

FRidh commented Aug 13, 2017

This needs a minor rebase. I successfully built the 2.7 version of spacy but the 3.6 hang during building.

@FRidh
Copy link
Member

FRidh commented Aug 13, 2017

I've pushed this to master.

@vcunat vcunat closed this Aug 19, 2017
@JonathanReeve
Copy link
Contributor

Great work! Any idea how to install the data using Nix? (E.g. en_core_web_lg?)

@Janik-Haag Janik-Haag added the 12. first-time contribution This PR is the author's first one; please be gentle! label Jun 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: python 8.has: package (new) This PR adds a new package 9.needs: clean-up 12. first-time contribution This PR is the author's first one; please be gentle!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants