Unable to reproduce results on local machine vs cloud #6905

lingzhou125 · 2021-04-25T20:10:43Z

Local Machine
XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=0.942219119220772,
gamma=0.30088663356155, gpu_id=0, importance_type='gain',
interaction_constraints='', learning_rate=0.05, max_delta_step=0,
max_depth=10, min_child_weight=6, missing=nan,
monotone_constraints='()', n_estimators=250, n_jobs=-1,
num_parallel_tree=1, random_state=0, reg_alpha=3.54578021703862,
reg_lambda=0.426143991951751, scale_pos_weight=1,
subsample=0.946270611429848, tree_method='gpu_hist',
validate_parameters=1, verbosity=3)

Cloud
XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=0.942219119220772,
gamma=0.30088663356155, gpu_id=0, importance_type='gain',
interaction_constraints='', learning_rate=0.05, max_delta_step=0,
max_depth=10, min_child_weight=6, missing=nan,
monotone_constraints='()', n_estimators=250, n_jobs=-1,
num_parallel_tree=1, random_state=0, reg_alpha=3.54578021703862,
reg_lambda=0.426143991951751, scale_pos_weight=1,
subsample=0.946270611429848, tree_method='hist',
validate_parameters=1, verbosity=3)

It's not off by a little... it's wildly different. The training set shape is (501808, 314). To be clear, the results are reproducible between runs on the local machine and between runs on the cloud, but the results from the local machine and the cloud are nowhere close to each other.

trivialfis · 2021-04-25T22:22:59Z

You have specified different tree methods.
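One way to spot this kind of mismatch is to diff the two parameter sets mechanically instead of comparing the printed reprs by eye. The sketch below is only an illustration: `diff_params` is a hypothetical helper, and the two dicts are hand-copied subsets of the parameters posted above (with a real estimator, the dict would presumably come from the scikit-learn style `get_params()` method). NaN values are treated as equal so that `missing=nan` is not reported as a false difference.

```python
import math


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for every parameter that differs.

    Hypothetical helper, not part of XGBoost.
    """
    def same(x, y):
        # nan != nan in Python, so compare NaNs explicitly.
        if isinstance(x, float) and isinstance(y, float):
            if math.isnan(x) and math.isnan(y):
                return True
        return x == y

    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if not same(a.get(k), b.get(k))}


# Subsets of the two configurations from the original post.
local = {"tree_method": "gpu_hist", "n_jobs": -1, "random_state": 0,
         "max_depth": 10, "missing": float("nan")}
cloud = {"tree_method": "hist", "n_jobs": -1, "random_state": 0,
         "max_depth": 10, "missing": float("nan")}

differences = diff_params(local, cloud)
print(differences)
```

For the two configurations in this issue, the only entry reported is `tree_method`.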

lingzhou125 · 2021-04-25T22:46:24Z

I started a GPU instance on SageMaker and tried again, and the results are still different. I have also tried tree_method='exact' on both and got different results.

trivialfis · 2021-04-28T15:10:22Z

Reproducibility issues like this are quite difficult to track down. You can try setting n_jobs to 1. Most of them are caused by floating-point error: floating-point addition is not associative, so in a parallel execution environment the order in which values are summed can change, and the result can change with it.
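The non-associativity point can be illustrated without XGBoost at all. The following is a minimal sketch in plain Python, not a description of how XGBoost accumulates gradients: `sequential_sum` and `chunked_sum` are hypothetical helpers, and the chunking only imitates workers that each sum a slice before the partial sums are combined. Explicit loops are used instead of the built-in `sum()` so the rounding behaviour does not depend on the interpreter's summation strategy.

```python
def sequential_sum(values):
    """Add the values left to right, one at a time."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Imitate a parallel reduction: sum each chunk, then sum the partials."""
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [sequential_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return sequential_sum(partials)


# Grouping alone changes the result of adding the same three numbers.
a, b, c = 1e20, -1e20, 1.0
left_grouped = (a + b) + c    # 0.0 + 1.0
right_grouped = a + (b + c)   # the 1.0 is lost when added to -1e20

# The same four values, reduced by one "worker" and by two.
values = [1e20, 1.0, -1e20, 1.0]
one_worker = chunked_sum(values, 1)
two_workers = chunked_sum(values, 2)

print(left_grouped, right_grouped)  # 1.0 0.0
print(one_worker, two_workers)      # 1.0 0.0
```

The inputs here are deliberately extreme so the effect is visible in a few values. In real training the per-operation differences are tiny, but they may decide which split wins at a node, and a different split changes every tree built after it.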

trivialfis closed this as completed May 13, 2021

trivialfis mentioned this issue Mar 13, 2023

[doc][dask] Note on reproducible result. [skip ci] #8903

Merged
