v1.7.5: Bug fixes and new CLI commands
We've been delighted to see spaCy growing so much over the last few months. Before the v1.0 release, we asked for your feedback, which has been incredibly helpful in improving the library. As we get closer to v2.0, we hope you'll take a few minutes to fill out the survey to help us understand how you're using the library and how it can be better.
📊 Take the survey!
✨ Major features and improvements
- NEW: Experimental `convert` and `model` commands to convert files to spaCy's JSON format for training, and initialise a new model and its data directory.
- Updated language data for Spanish and Portuguese.
🔴 Bug fixes
- Error messages now show the new download commands if no model is loaded.
- The `package` command now works correctly and doesn't fail when creating files.
- Fix issue #693: Improve rules for detecting noun chunks.
- Fix issue #758: Adding labels now doesn't cause `EntityRecognizer` transition bug.
- Fix issue #862: `label` keyword argument is now handled correctly in `doc.merge()`.
- Fix issue #891: Tokens containing `/` infixes are now split by the tokenizer.
- Fix issue #898: Dependencies are now deprojectivized correctly.
- Fix issue #910: NER models with new labels are now saved correctly, preventing memory errors.
- Fix issue #934, #946: Symlink paths are now handled correctly on Windows, preventing `invalid switch` error.
- Fix issue #947: Hebrew module is now added to `setup.py` and `__init__.py`.
- Fix issue #948: Contractions are now lemmatized correctly.
- Fix issue #957: Use `regex` module to avoid back-tracking on URL regex.
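
To illustrate what the `/` infix fix (#891) means in practice, here is a minimal standalone sketch of infix splitting using only Python's standard `re` module. It is not spaCy's tokenizer code: the pattern `INFIX_SLASH` and the helpers `split_on_slash_infix` and `tokenize` are hypothetical names made up for this example, and the rule used (a slash with a non-space character on both sides) is an assumption about what counts as an infix.

```python
import re

# Hypothetical infix rule (an assumption, not spaCy's actual pattern):
# a "/" that has a non-whitespace character on both sides.
INFIX_SLASH = re.compile(r"(?<=\S)/(?=\S)")


def split_on_slash_infix(token):
    """Split one whitespace-delimited token around each "/" infix.

    The slash itself is kept as a separate piece.
    """
    pieces = []
    start = 0
    for match in INFIX_SLASH.finditer(token):
        if match.start() > start:
            pieces.append(token[start:match.start()])
        pieces.append(match.group())
        start = match.end()
    if start < len(token):
        pieces.append(token[start:])
    return pieces


def tokenize(text):
    """Whitespace-split the text, then apply the infix rule to each chunk."""
    tokens = []
    for chunk in text.split():
        tokens.extend(split_on_slash_infix(chunk))
    return tokens


print(tokenize("Use and/or here"))
```

Under this rule, a slash at the start of a chunk (as in `/usr`) is not treated as an infix, because nothing precedes it within the chunk.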
📖 Documentation and examples
- Documentation for new `convert` and `model` commands.
- Update troubleshooting guide with `--no-cache-dir` error resulting from outdated pip version and file name shadowing model problem.
- Fix various typos and inconsistencies.
👥 Contributors
Thanks to @ericzhao28, @Gregory-Howard, @kinow, @jreeter, @mamoit, @kumaranvpl and @dvsrepo for the pull requests!