This project's continuous integration (CI) should include a job which tests that LightGBM model files produced by previous versions can be successfully loaded and used in newer versions.
Specifically, it should test the following claim:
Model files produced in LightGBM version (N).x.x should be readable and usable in all versions in the same major version series.
It should also include tests of expected compatibility between other versions. For example, if 4.0.0 does not include breaking changes to saving / loading of model files, then a test should be added that such a file created in LightGBM 3.2.1 can be loaded in LightGBM 4.0.0.
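One way to make that claim testable is to enumerate which (producer, consumer) version pairs a CI job would have to cover. The sketch below is only an illustration: the function names and the version list are hypothetical, and it assumes that only the "older file, newer library" direction needs checking, plus any cross-major pairs that are explicitly declared compatible.

```python
from itertools import product


def parse_version(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def compatibility_pairs(versions, cross_major=()):
    """Return (producer, consumer) pairs whose model files are expected to load.

    A pair is included when the consumer is newer than the producer and both
    share a major version, or when it is explicitly listed in ``cross_major``
    (for example a 3.2.1 file loaded in 4.0.0).
    """
    pairs = []
    for producer, consumer in product(versions, repeat=2):
        p, c = parse_version(producer), parse_version(consumer)
        if p < c and p[0] == c[0]:
            pairs.append((producer, consumer))
    # Cross-major pairs are opt-in: they are only tested when declared.
    pairs.extend(pair for pair in cross_major if pair not in pairs)
    return sorted(pairs, key=lambda pair: (parse_version(pair[0]), parse_version(pair[1])))


if __name__ == "__main__":
    # Illustrative version list, not a statement of what CI must cover.
    for producer, consumer in compatibility_pairs(
        ["3.1.0", "3.2.1", "4.0.0"],
        cross_major=[("3.2.1", "4.0.0")],
    ):
        print(f"model from {producer} must load in {consumer}")
```

Each resulting pair could then map to one CI matrix entry that installs the producer version, writes the model files, installs the consumer version, and loads them.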
"model files" refers to the following:
(Python, R, C++) models saved to string using LGBM_BoosterSaveModelToString()
(Python, R, C++) models saved to text file using LGBM_BoosterSaveModel()
(Python) pickled lightgbm.Booster objects (saved with cloudpickle, joblib, or pickle)
(R) .rds files created with saveRDS.lgb.Booster() or saveRDS()
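For the Python formats in the list above, a test could be split into a "produce" step run under the older version and a "consume" step run under the newer one. The following is a minimal sketch, not a proposed implementation: the file names, the `produce()` / `consume()` helpers, and the synthetic data are all assumptions made for illustration. It uses `Booster.save_model()` and `Booster.model_to_string()`, the Python wrappers around the two C API functions, and plain `pickle`; `cloudpickle`, `joblib`, and the R `.rds` case are left out.

```python
import pickle
from pathlib import Path

# Hypothetical artifact names; the issue does not prescribe a file layout.
ARTIFACTS = {
    "text_file": "model.txt",
    "string": "model_string.txt",
    "pickle": "booster.pkl",
}


def produce(out_dir):
    """Run under the OLDER LightGBM version: train once, save every format."""
    import lightgbm as lgb
    import numpy as np

    out = Path(out_dir)
    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 4))
    y = X[:, 0] + 0.1 * rng.normal(size=200)
    booster = lgb.train(
        {"objective": "regression", "verbose": -1},
        lgb.Dataset(X, label=y),
        num_boost_round=5,
    )
    booster.save_model(str(out / ARTIFACTS["text_file"]))
    (out / ARTIFACTS["string"]).write_text(booster.model_to_string())
    (out / ARTIFACTS["pickle"]).write_bytes(pickle.dumps(booster))
    # Store inputs and predictions so the consumer can check "usable",
    # not just "readable".
    np.save(out / "X.npy", X)
    np.save(out / "expected.npy", booster.predict(X))


def consume(out_dir):
    """Run under the NEWER LightGBM version: load every format and predict."""
    import lightgbm as lgb
    import numpy as np

    out = Path(out_dir)
    X = np.load(out / "X.npy")
    expected = np.load(out / "expected.npy")
    boosters = {
        "text_file": lgb.Booster(model_file=str(out / ARTIFACTS["text_file"])),
        "string": lgb.Booster(model_str=(out / ARTIFACTS["string"]).read_text()),
        "pickle": pickle.loads((out / ARTIFACTS["pickle"]).read_bytes()),
    }
    for name, booster in boosters.items():
        np.testing.assert_allclose(
            booster.predict(X), expected, rtol=1e-6, atol=1e-8, err_msg=name
        )
    return sorted(boosters)
```

In CI, `produce()` and `consume()` would run in separate environments with different LightGBM versions installed, sharing only the output directory. Comparing predictions against the stored array is a design choice here; a looser check (the file loads without error) would also satisfy the "readable" half of the claim.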
Motivation
LightGBM uses semantic versioning for releases. As a result, users expect that there will not be breaking changes within a major release series. For example, they expect that a Booster saved to a text file using LightGBM 3.1.0 will be readable in any other LightGBM 3.x.x release.
Adding explicit tests on that fact might provide greater confidence that releases are not introducing such changes, and might help to catch issues like #3778 (PR #4056) before they are merged.
This issue has been added to #2302 with other feature requests. I'd like to leave it open for a few days in case others want to add comments, since I just locked discussion on #4228.
After a few days, this issue will be closed until someone leaves a comment saying they'd like to work on it.
Ok now that this has been open for a few days, I am going to close it. If you're reading this and would like to work on this, please comment below and it can be re-opened!
References
Created based on #4228 (comment).
See https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html#saving-dask-models for some documentation that explains the different ways that one type of LightGBM model object (Dask estimators in Python) can be saved.
saveRDS() for R objects will only work once #4208 is addressed.