Release 0.6.4
This release is a major update to the framework, with a new internal design and additional functionality. Support for Python 2, which had long been broken, is dropped; all future releases will target Python 3 only, starting from version 3.6.
New models and additional functionality
- New Kernelized Probabilistic MF model.
- Built-in support for a scaled version of PureSVD (see the Reproducing EIGENREC results tutorial for details).
- Simple hybrid model that uses feature-similarity scores aggregation.
- Baseline models for item cold start regime: popularity-based, random, similarity-aggregation model, PureSVD.
- New classes to support item post-filtering.
- Unified handling of side feature-based relations.
- Support for several learning-rate schedules in SGD: adagrad, adam, rmsprop, plus three heuristic schedules of my own: adanorm, gnprop and gnpropz.
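For readers unfamiliar with adaptive learning-rate schedules, here is a minimal, generic sketch of the AdaGrad update rule in plain Python. It is not Polara's implementation: the function name `adagrad_minimize` and all of its parameters are hypothetical and chosen for illustration only.

```python
import math


def adagrad_minimize(grad, x0, lr=0.5, eps=1e-8, n_steps=500):
    """Minimize a function with AdaGrad, given its gradient.

    Hypothetical helper for illustration; not part of Polara.
    """
    x = list(x0)
    # Running sum of squared gradients, one entry per coordinate.
    accum = [0.0] * len(x)
    for _ in range(n_steps):
        g = grad(x)
        for i, g_i in enumerate(g):
            accum[i] += g_i * g_i
            # Coordinates with a large gradient history take smaller steps.
            x[i] -= lr * g_i / (math.sqrt(accum[i]) + eps)
    return x


def toy_gradient(x):
    """Gradient of f(x) = (x[0] - 3)^2 + 10 * (x[1] + 1)^2."""
    return [2.0 * (x[0] - 3.0), 20.0 * (x[1] + 1.0)]


solution = adagrad_minimize(toy_gradient, [0.0, 0.0])
```

Because each coordinate is normalized by its own gradient history, the two coordinates of the toy problem converge at a similar pace despite their very different curvature.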
Hyper-parameter tuning
- Generic find_optimal_config function to perform random grid search over a user-defined hyper-parameter space.
- New find_optimal_svd_rank routine to quickly and efficiently tune SVD.
- New find_optimal_tucker_ranks routine to quickly and efficiently tune tensor-based models.
- Users can now define which configurations to skip in the random grid search.
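The idea of a random grid search with skipped configurations can be sketched as follows. This is a generic illustration using only the standard library, not the actual signature or behavior of find_optimal_config; the function `random_grid_search`, its parameters, and the toy scoring function are all assumptions made for the example.

```python
import itertools
import random


def random_grid_search(param_grid, evaluate, n_samples, skip=None, seed=0):
    """Score a random subset of a hyper-parameter grid and return the best config.

    Hypothetical helper for illustration; not part of Polara.
    """
    names = list(param_grid)
    # Enumerate every combination of the listed parameter values.
    grid = [
        dict(zip(names, values))
        for values in itertools.product(*(param_grid[name] for name in names))
    ]
    # Drop configurations the user asked to skip.
    if skip is not None:
        grid = [config for config in grid if not skip(config)]
    # Sample without replacement so that no configuration is scored twice.
    rng = random.Random(seed)
    sampled = rng.sample(grid, min(n_samples, len(grid)))
    scored = [(evaluate(config), config) for config in sampled]
    best_score, best_config = max(scored, key=lambda pair: pair[0])
    return best_config, best_score


def toy_score(config):
    """Made-up score that peaks at rank=20 with the smallest regularization."""
    return -abs(config["rank"] - 20) - config["regularization"]


grid = {"rank": [10, 20, 40], "regularization": [0.01, 0.1, 1.0]}
best_config, best_score = random_grid_search(
    grid, toy_score, n_samples=9, skip=lambda config: config["rank"] == 40
)
```

Here the skip predicate removes every configuration with rank 40 before sampling, so the search budget is spent only on configurations the user considers worth trying.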
Evaluation
- New versatile run_cv_experiment routine to automate cross-validation experiments. Supports both the default and user-defined evaluation protocols.
- More ways to evaluate against the specific set of metrics supported by Polara.
Performance improvements
- Efficient handling of indices in the LightFM model (reduces memory load by orders of magnitude compared to the native LightFM implementation).
- Rating prediction with tensor-based models is now more efficient.
- Computation of Tucker core in tensor-based models is now optional.
Other improvements
- Revived Turi Create (formerly GraphLab Create) support with its factorization models, including Factorization Machines.
- Refactored evaluation code.
- Refactored and improved code for SGD-based matrix factorization. Now supports both naive and probabilistic implementations.
- Improved handling of sparse operations.
- Better handling of side features.
- Improved timing functionality.
- Internal naming is now more consistent.
- Support for Amazon and Epinions datasets.
- Allow unpacking the probe part of the Netflix dataset.
- Some other minor improvements and fixes.