Releases: KxSystems/ml

4.1.0

11 Nov 12:15
5509fa6

ML Registry Functionality: A location for storing and versioning ML models on-prem, together with a common model retrieval API that allows models to be retrieved and used on kdb+ data regardless of their underlying requirements. Centralising work in a common storage location improves team collaboration and management oversight.

4.0.0

09 Oct 11:31
7657d99

The release of ML toolkit 4.0 comes with several key changes, enhancements and improvements:

  • Unified Codebase: Migrated other components of the ML toolkit (NLP & AutoML) into the same repository for improved code sharing and maintainability.
  • PyKX Support: NLP, ML and AutoML now use PyKX if it is available and otherwise fall back to embedPy.
  • Python Dependency Updates: Added support for Python 3.11 and removed several dependency version pins and limits to ensure compatibility and improve performance.
  • Enhanced Testing & CI: Improved internal testing and continuous integration systems, ensuring better reliability for future releases. Includes automated Snyk scans for enhanced security.
  • Multi-Processing Support Fix: Resolved issues with multi-processing support, providing more robust and efficient parallel processing capabilities.
  • Examples Provided: Comprehensive examples and associated sample output reports are now available under examples/. These examples offer practical use cases and demonstrate the new features and improvements.

3.2.0

07 Jun 12:17
439f70b
  • Fix for issues relating to unsupported versions of scipy
  • Updates to tests that are no longer supported by the equivalent Python functions

3.1.0

15 Nov 13:44
5e33440
  • Update to FRESH functionality to be more efficient in distributed applications
  • Fix to df2tab to handle nulls appropriately in date columns
  • Fix to tsPlot functionality

Addition of stats library in tgz releases

29 Jul 13:42
6f4d551
Addition of stats library to packaged release (#95)

* Addition of stats folder for .tgz releases

* length update for FRESH functionality

3.0.1

30 Jun 11:01
ae11d47

Addition of stats library for docker image deployment

3.0.0

10 Mar 14:44
5781cd1
  • Refactor coding/commenting style to be up to date with coding standards
  • Addition of stats section. This includes functionality such as
    - OLS/WLS fit/predict functionality
    - Transfer of percentile/describe function from utility folder to stats folder
    - Expansion of the .ml.describe function to give users more flexibility through a user-configurable JSON file
  • Function names were changed to camel case. Any functions affected by this change are defined within functionMapping.json. The old names remain callable until the next release of the ML Toolkit. If an old name is called, a warning message is sent to stdout.
  • Scaling and transformation preprocessing functions were amended to contain a fit/transform/fitTransform key. Any functions affected by this change are defined within functionMapping.json. The old versions remain callable until the next release of the ML Toolkit. If an old version is called, a warning message is sent to stdout.
  • All functions whose output contains a predict/update/transform key must now take config as their first input. config is a dictionary with a modelInfo key.
  • The contents of FRESH's hyperparam.txt file were converted to a JSON file, hyperparameters.json
  • The utility functions within the clustering library were moved to clust/utils.q
  • init.q can now be loaded before initialization of ml.q
  • All README files were updated to reflect that the toolkit is no longer in its beta release stage
  • A test script, filelength.t, was added to check that lines of code in files do not exceed 80 characters
  • Tests are now run in AppVeyor/Travis by calling testFiles.bat, which will be updated whenever a new test folder is added to the toolkit
  • All tests were updated to reflect these changes
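
The fit/transform/fitTransform layout and the config-with-modelInfo convention described above can be illustrated with a small sketch. This is written in Python rather than q, and every name in it (min_max_fit, min_max_transform, min_max_fit_transform, and the contents of modelInfo) is hypothetical; it shows the general pattern only, not the toolkit's actual functions or signatures.

```python
# Illustrative sketch only: hypothetical names, not the ML Toolkit API.

def min_max_transform(config, data):
    """Scale data using a config dictionary that carries a modelInfo key."""
    info = config["modelInfo"]
    span = info["max"] - info["min"]
    if span == 0:
        # Degenerate case: all fitted values were identical.
        return [0.0 for _ in data]
    return [(x - info["min"]) / span for x in data]


def min_max_fit(data):
    """Fit a min-max scaler and return its state plus a transform entry."""
    config = {"modelInfo": {"min": min(data), "max": max(data)}}
    # The transform entry takes new data and applies the fitted state,
    # passing config as the first argument to the underlying function.
    config["transform"] = lambda new_data: min_max_transform(config, new_data)
    return config


def min_max_fit_transform(data):
    """Convenience wrapper: fit on data, then transform the same data."""
    return min_max_fit(data)["transform"](data)


model = min_max_fit([2.0, 4.0, 6.0])
print(model["modelInfo"])
print(model["transform"]([3.0, 5.0]))
```

Keeping the fitted state in a dictionary means the same transform can be applied later to data that was not seen at fit time.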

2.0.0

04 Jan 09:11
2540ade

What’s New:

Time series functionality:

  • Addition of time series models implemented in q
    • AR, ARMA, ARIMA, SARIMA and ARCH.
  • Time series feature engineering techniques (windowed and lagged feature generation)
  • Data stationarity testing
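
As a rough illustration of what lagged and windowed feature generation means, the sketch below builds lag columns and a trailing moving average from a single series. It is plain Python, not the toolkit's q implementation, and the function names lag_features and window_mean are assumptions made for this example.

```python
# Illustrative sketch only: hypothetical helpers, not the ML Toolkit API.

def lag_features(series, lags):
    """One row per time step holding the value at each requested lag.

    Positions without enough history are filled with None.
    """
    rows = []
    for t in range(len(series)):
        rows.append([series[t - k] if t - k >= 0 else None for k in lags])
    return rows


def window_mean(series, width):
    """Trailing moving average; None until a full window is available."""
    out = []
    for t in range(len(series)):
        if t + 1 < width:
            out.append(None)
        else:
            window = series[t + 1 - width:t + 1]
            out.append(sum(window) / width)
    return out


prices = [10.0, 12.0, 11.0, 13.0, 15.0]
print(lag_features(prices, [1, 2]))
print(window_mean(prices, 3))
```

Both helpers only look backwards in time, so the features for a given step never use values from later steps.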

Graph/pipeline resources:

  • Framework for the development of modularised kdb+ workflows and executable pipeline structures

Optimization:

  • Implementation of the Broyden-Fletcher-Goldfarb-Shanno algorithm for function minimization
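
For readers unfamiliar with the method, the following is a minimal BFGS sketch in plain Python. It assumes the caller supplies an analytic gradient and uses a simple backtracking (Armijo) line search; these choices, the function name bfgs, and its parameters are assumptions for illustration and do not describe how the toolkit's q implementation is structured.

```python
# Illustrative sketch only: a textbook BFGS loop, not the ML Toolkit code.

def bfgs(f, grad, x0, tol=1e-8, max_iter=200):
    """Minimise f from x0, maintaining an inverse-Hessian approximation H."""
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = grad(x)
    for _ in range(max_iter):
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        # Quasi-Newton search direction p = -H g.
        p = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        # Backtracking line search enforcing the Armijo condition.
        fx = f(x)
        slope = sum(gi * pi for gi, pi in zip(g, p))
        step = 1.0
        while (f([xi + step * pi for xi, pi in zip(x, p)])
               > fx + 1e-4 * step * slope) and step > 1e-12:
            step *= 0.5
        x_new = [xi + step * pi for xi, pi in zip(x, p)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # skip the update if the curvature condition fails
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hi for yi, hi in zip(y, Hy))
            # H <- (I - rho s y^T) H (I - rho y s^T) + rho s s^T, expanded.
            for i in range(n):
                for j in range(n):
                    H[i][j] += (-rho * (s[i] * Hy[j] + Hy[i] * s[j])
                                + (rho * rho * yHy + rho) * s[i] * s[j])
        x, g = x_new, g_new
    return x


def objective(v):
    return (v[0] - 1.0) ** 2 + 2.0 * (v[1] + 3.0) ** 2


def objective_grad(v):
    return [2.0 * (v[0] - 1.0), 4.0 * (v[1] + 3.0)]


solution = bfgs(objective, objective_grad, [0.0, 0.0])
print(solution)
```

The example objective is a convex quadratic whose minimum lies at (1, -3), which makes it easy to check that the loop converges.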

Grid Search:

  • Random and Sobol-sequence (quasi-random) parameter set generation, providing an alternative to exhaustive grid search.
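
The difference between exhaustive and randomly generated parameter sets can be sketched as follows. This Python example covers only the uniform random case; Sobol sequences are not implemented here (in Python they are available through scipy.stats.qmc.Sobol). The function names and the parameter names k, tol and rate are hypothetical.

```python
# Illustrative sketch only: hypothetical helpers, not the ML Toolkit API.
import itertools
import random


def grid_search_params(space):
    """Every combination of the listed values (exhaustive grid)."""
    keys = sorted(space)
    combos = itertools.product(*(space[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]


def random_search_params(bounds, n, seed=0):
    """n parameter sets drawn uniformly from (low, high) bounds."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(n)
    ]


grid = grid_search_params({"k": [2, 3, 4], "tol": [0.1, 0.01]})
sampled = random_search_params({"tol": (0.0, 1.0), "rate": (1.0, 5.0)}, 4, seed=42)
print(len(grid), "grid points")
print(len(sampled), "random points")
```

The grid grows multiplicatively with each added parameter, whereas the number of random parameter sets is chosen directly by the caller.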

Clustering:

  • Implementation of k-means clustering now uses early stopping
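
The note does not state which stopping rule is used. One common form of early stopping for k-means, sketched below in Python for one-dimensional data, is to stop iterating once no centroid moves by more than a small tolerance. The function name kmeans_1d and its parameters are assumptions for this example.

```python
# Illustrative sketch only: the stopping rule here is an assumption.

def kmeans_1d(points, centroids, max_iter=100, tol=1e-6):
    """Lloyd's algorithm on 1-D data, stopping early once centroids settle.

    Returns the final centroids and the number of iterations performed.
    """
    centroids = list(centroids)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        if shift < tol:
            break  # early stop: centroids have stopped moving
    return centroids, iteration


final_centroids, iterations_used = kmeans_1d(
    [1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0]
)
print(final_centroids, iterations_used)
```

With well-separated data the loop ends after a handful of iterations rather than running to max_iter.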

Updates:

Clustering:

  • Fit/predict/update style function calls replace the previous combined fit+predict call, allowing fitted models to be deployed to classify incoming data.
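
A minimal sketch of why separate fit, predict and update calls help with deployment: fit produces a model once, predict classifies new points without refitting, and update folds new points into the model. The Python below uses one-dimensional nearest-centroid clustering; the function names, the model layout and the running-mean update rule are assumptions for illustration, not the toolkit's implementation.

```python
# Illustrative sketch only: hypothetical model layout and update rule.

def _nearest(centroids, p):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))


def fit(points, init_centroids, iterations=10):
    """Run a fixed number of k-means iterations and return a model dict."""
    centroids = list(init_centroids)
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[_nearest(centroids, p)].append(p)
        centroids = [
            sum(g) / len(g) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    counts = [0] * len(centroids)
    for p in points:
        counts[_nearest(centroids, p)] += 1
    return {"centroids": centroids, "counts": counts}


def predict(model, points):
    """Classify incoming points against the fitted centroids."""
    return [_nearest(model["centroids"], p) for p in points]


def update(model, points):
    """Return a new model with points folded in via a running mean."""
    centroids = list(model["centroids"])
    counts = list(model["counts"])
    for p in points:
        i = _nearest(centroids, p)
        counts[i] += 1
        centroids[i] += (p - centroids[i]) / counts[i]
    return {"centroids": centroids, "counts": counts}


model = fit([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
labels = predict(model, [2.5, 9.0])
updated = update(model, [4.0])
print(labels)
print(updated["centroids"])
```

Because update returns a new dictionary, the originally fitted model stays available for comparison or rollback.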

Initial release candidate for version 2.0.0 (update)

12 Oct 17:00
41c66e7

Additive update, including clustering updates

  • Fit/predict/update style function calls replace the previous combined fit+predict call, allowing fitted models to be deployed to classify incoming data.

Initial release candidate for version 2.0.0

06 Oct 18:24
09ad749

What’s New:

Time series functionality:

  • Addition of time series models implemented in q
    • AR, ARMA, ARIMA, SARIMA and ARCH.
  • Time series feature engineering techniques (windowed and lagged feature generation)
  • Data stationarity testing

Graph/pipeline resources:

  • Framework for the development of modularised kdb+ workflows and executable pipeline structures

Optimization:

  • Implementation of the Broyden-Fletcher-Goldfarb-Shanno algorithm for function minimization

Grid Search:

  • Random and Sobol-sequence (quasi-random) parameter set generation, providing an alternative to exhaustive grid search.

Clustering:

  • Implementation of k-means clustering now uses early stopping

Updates:

Clustering:

  • Fit/predict/update style function calls replace the previous combined fit+predict call, allowing fitted models to be deployed to classify incoming data.