diff --git a/docs/Parameters.rst b/docs/Parameters.rst
index c3859e76224f..c4196cca7a65 100644
--- a/docs/Parameters.rst
+++ b/docs/Parameters.rst
@@ -594,7 +594,7 @@ Learning Control Parameters

    -  larger values give stronger regularization

-   -  the weight of each node is ``(n / path_smooth) * w + w_p / (n / path_smooth + 1)``, where ``n`` is the number of samples in the node, ``w`` is the optimal node weight to minimise the loss (approximately ``-sum_gradients / sum_hessians``), and ``w_p`` is the weight of the parent node
+   -  the weight of each node is ``w * (n / path_smooth) / (n / path_smooth + 1) + w_p / (n / path_smooth + 1)``, where ``n`` is the number of samples in the node, ``w`` is the optimal node weight to minimise the loss (approximately ``-sum_gradients / sum_hessians``), and ``w_p`` is the weight of the parent node

    -  note that the parent output ``w_p`` itself has smoothing applied, unless it is the root node, so that the smoothing effect accumulates with the tree depth

diff --git a/include/LightGBM/config.h b/include/LightGBM/config.h
index a7c254e5aa0d..900a43301e1c 100644
--- a/include/LightGBM/config.h
+++ b/include/LightGBM/config.h
@@ -531,7 +531,7 @@ struct Config {
   // desc = if set to zero, no smoothing is applied
   // desc = if ``path_smooth > 0`` then ``min_data_in_leaf`` must be at least ``2``
   // desc = larger values give stronger regularization
-  // descl2 = the weight of each node is ``(n / path_smooth) * w + w_p / (n / path_smooth + 1)``, where ``n`` is the number of samples in the node, ``w`` is the optimal node weight to minimise the loss (approximately ``-sum_gradients / sum_hessians``), and ``w_p`` is the weight of the parent node
+  // descl2 = the weight of each node is ``w * (n / path_smooth) / (n / path_smooth + 1) + w_p / (n / path_smooth + 1)``, where ``n`` is the number of samples in the node, ``w`` is the optimal node weight to minimise the loss (approximately ``-sum_gradients / sum_hessians``), and ``w_p`` is the weight of the parent node
   // descl2 = note that the parent output ``w_p`` itself has smoothing applied, unless it is the root node, so that the smoothing effect accumulates with the tree depth
   double path_smooth = 0;
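
Reviewer note: a minimal Python sketch of the two documented formulas, included only to illustrate why the wording changed. The function names `smoothed_weight` and `old_doc_formula` are hypothetical and are not LightGBM code. The sketch evaluates only the expressions quoted in the diff and assumes, per the parameter description, that `path_smooth = 0` means no smoothing.

```python
def smoothed_weight(w, w_p, n, path_smooth):
    """Corrected formula from the diff (hypothetical helper, not LightGBM code).

    w           -- unsmoothed optimal node weight
    w_p         -- (already smoothed) weight of the parent node
    n           -- number of samples in the node
    path_smooth -- smoothing strength; 0 disables smoothing
    """
    if path_smooth <= 0:
        # "if set to zero, no smoothing is applied"
        return w
    k = n / path_smooth
    # Weighted average of w and w_p: the coefficients k/(k+1) and 1/(k+1) sum to 1.
    return w * k / (k + 1) + w_p / (k + 1)


def old_doc_formula(w, w_p, n, path_smooth):
    """Expression from the previous documentation text, kept for comparison."""
    k = n / path_smooth
    # The w term is not divided by (k + 1), so the coefficients do not sum to 1.
    return k * w + w_p / (k + 1)


if __name__ == "__main__":
    w, w_p, n, path_smooth = 1.0, 0.0, 100, 10
    new = smoothed_weight(w, w_p, n, path_smooth)
    old = old_doc_formula(w, w_p, n, path_smooth)
    print(f"corrected: {new:.4f}")  # stays between w_p and w
    print(f"previous:  {old:.4f}")  # scales w by n / path_smooth
```

With `w = 1.0`, `w_p = 0.0`, `n = 100` and `path_smooth = 10`, the corrected expression gives `10 / 11` (about 0.909), which lies between the parent and node weights, while the previous text evaluates to `10.0`. The corrected wording therefore describes a weighted average that shifts toward `w_p` as `path_smooth` grows relative to `n`, matching "larger values give stronger regularization".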