It's not off by a little... it's wildly different. The training set shape is (501808, 314). To be clear, the results are reproducible across runs on the local machine and across runs on the cloud, but the results between the local machine and the cloud are nowhere close.
I started a GPU instance on SageMaker and tried it again, and still got different results. I have also tried tree_method='exact' on both and received different results.
These types of reproducibility issues are quite difficult to solve. You can try setting n_jobs to 1. Most of them are caused by floating-point error: floating-point addition is non-associative, so in a parallel execution environment the order of summation can vary and produce non-reproducible behaviour.
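The non-associativity mentioned above is easy to demonstrate in plain Python, independent of XGBoost. The same three numbers summed in a different grouping give different results, which is exactly the kind of drift a parallel reduction can introduce:

```python
# Floating-point addition is not associative: regrouping the same
# operands changes the rounding and therefore the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

print(left == right)  # False
```

With n_jobs=1 the summation order is fixed, which removes this source of run-to-run variation on a single machine (though different hardware or different tree_method implementations can still sum in different orders).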
Local Machine
XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=0.942219119220772,
gamma=0.30088663356155, gpu_id=0, importance_type='gain',
interaction_constraints='', learning_rate=0.05, max_delta_step=0,
max_depth=10, min_child_weight=6, missing=nan,
monotone_constraints='()', n_estimators=250, n_jobs=-1,
num_parallel_tree=1, random_state=0, reg_alpha=3.54578021703862,
reg_lambda=0.426143991951751, scale_pos_weight=1,
subsample=0.946270611429848, tree_method='gpu_hist',
validate_parameters=1, verbosity=3)
Cloud
XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=0.942219119220772,
gamma=0.30088663356155, gpu_id=0, importance_type='gain',
interaction_constraints='', learning_rate=0.05, max_delta_step=0,
max_depth=10, min_child_weight=6, missing=nan,
monotone_constraints='()', n_estimators=250, n_jobs=-1,
num_parallel_tree=1, random_state=0, reg_alpha=3.54578021703862,
reg_lambda=0.426143991951751, scale_pos_weight=1,
subsample=0.946270611429848, tree_method='hist',
validate_parameters=1, verbosity=3)
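Note that the two dumps above are not actually identical: the local run used tree_method='gpu_hist' while the cloud run used tree_method='hist', which are different algorithms and are not expected to build the same trees. A quick way to surface mismatches like this is to diff the parameter dicts from each machine (in practice you would collect them with `model.get_params()` on each side; the dicts below are transcribed from the dumps as a sketch):

```python
# Transcribed from the two dumps above (abridged to a few keys for illustration).
local_params = {"tree_method": "gpu_hist", "n_jobs": -1, "random_state": 0}
cloud_params = {"tree_method": "hist", "n_jobs": -1, "random_state": 0}

# Keep only the keys whose values disagree between the two environments.
diff = {
    k: (local_params[k], cloud_params.get(k))
    for k in local_params
    if local_params[k] != cloud_params.get(k)
}
print(diff)  # {'tree_method': ('gpu_hist', 'hist')}
```

Pinning tree_method to the same value on both machines (e.g. 'hist' everywhere) is a prerequisite before comparing results at all.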