Rework MAP and Pairwise for LTR. #9075

Merged · 3 commits · Apr 27, 2023
1 change: 0 additions & 1 deletion R-package/src/Makevars.in
@@ -32,7 +32,6 @@ OBJECTS= \
$(PKGROOT)/src/objective/objective.o \
$(PKGROOT)/src/objective/regression_obj.o \
$(PKGROOT)/src/objective/multiclass_obj.o \
-$(PKGROOT)/src/objective/rank_obj.o \
$(PKGROOT)/src/objective/lambdarank_obj.o \
$(PKGROOT)/src/objective/hinge.o \
$(PKGROOT)/src/objective/aft_obj.o \
1 change: 0 additions & 1 deletion R-package/src/Makevars.win
@@ -32,7 +32,6 @@ OBJECTS= \
$(PKGROOT)/src/objective/objective.o \
$(PKGROOT)/src/objective/regression_obj.o \
$(PKGROOT)/src/objective/multiclass_obj.o \
-$(PKGROOT)/src/objective/rank_obj.o \
$(PKGROOT)/src/objective/lambdarank_obj.o \
$(PKGROOT)/src/objective/hinge.o \
$(PKGROOT)/src/objective/aft_obj.o \
18 changes: 14 additions & 4 deletions doc/model.schema
@@ -219,6 +219,16 @@
"num_pairsample": { "type": "string" },
"fix_list_weight": { "type": "string" }
}
},
"lambdarank_param": {
"type": "object",
"properties": {
"lambdarank_num_pair_per_sample": { "type": "string" },
"lambdarank_pair_method": { "type": "string" },
"lambdarank_unbiased": {"type": "string" },
"lambdarank_bias_norm": {"type": "string" },
"ndcg_exp_gain": {"type": "string"}
}
}
},
"type": "object",
@@ -477,22 +487,22 @@
"type": "object",
"properties": {
"name": { "const": "rank:pairwise" },
"lambda_rank_param": { "$ref": "#/definitions/lambda_rank_param"}
"lambda_rank_param": { "$ref": "#/definitions/lambdarank_param"}
},
"required": [
"name",
"lambda_rank_param"
"lambdarank_param"
]
},
{
"type": "object",
"properties": {
"name": { "const": "rank:ndcg" },
"lambda_rank_param": { "$ref": "#/definitions/lambda_rank_param"}
"lambda_rank_param": { "$ref": "#/definitions/lambdarank_param"}
},
"required": [
"name",
"lambda_rank_param"
"lambdarank_param"
]
},
{
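
For reference (not part of the diff): the new ``lambdarank_param`` block shows up in models saved to JSON. A minimal sketch of inspecting it from Python, assuming a booster already trained with one of the ``rank:*`` objectives and the usual ``learner.objective`` layout of the model file:

    import json

    # `booster` is assumed to be an xgboost.Booster trained with "rank:ndcg".
    booster.save_model("ranker.json")
    with open("ranker.json") as fd:
        model = json.load(fd)

    objective = model["learner"]["objective"]
    print(objective["name"])              # e.g. "rank:ndcg"
    print(objective["lambdarank_param"])  # the five string-typed fields above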
43 changes: 37 additions & 6 deletions doc/parameter.rst
@@ -233,7 +233,7 @@ Parameters for Tree Booster
.. note:: This parameter is working-in-progress.

- The strategy used for training multi-target models, including multi-target regression
-and multi-class classification. See :doc:`/tutorials/multioutput` for more information.
+  and multi-class classification. See :doc:`/tutorials/multioutput` for more information.

- ``one_output_per_tree``: One model for each target.
- ``multi_output_tree``: Use multi-target trees.
@@ -380,9 +380,9 @@ Specify the learning task and the corresponding learning objective. The objectiv
See :doc:`/tutorials/aft_survival_analysis` for details.
- ``multi:softmax``: set XGBoost to do multiclass classification using the softmax objective, you also need to set num_class(number of classes)
- ``multi:softprob``: same as softmax, but output a vector of ``ndata * nclass``, which can be further reshaped to ``ndata * nclass`` matrix. The result contains predicted probability of each data point belonging to each class.
-- ``rank:pairwise``: Use LambdaMART to perform pairwise ranking where the pairwise loss is minimized
-- ``rank:ndcg``: Use LambdaMART to perform list-wise ranking where `Normalized Discounted Cumulative Gain (NDCG) <http://en.wikipedia.org/wiki/NDCG>`_ is maximized
-- ``rank:map``: Use LambdaMART to perform list-wise ranking where `Mean Average Precision (MAP) <http://en.wikipedia.org/wiki/Mean_average_precision#Mean_average_precision>`_ is maximized
+- ``rank:ndcg``: Use LambdaMART to perform pair-wise ranking where `Normalized Discounted Cumulative Gain (NDCG) <http://en.wikipedia.org/wiki/NDCG>`_ is maximized. This objective supports position debiasing for click data.
+- ``rank:map``: Use LambdaMART to perform pair-wise ranking where `Mean Average Precision (MAP) <http://en.wikipedia.org/wiki/Mean_average_precision#Mean_average_precision>`_ is maximized
+- ``rank:pairwise``: Use LambdaRank to perform pair-wise ranking using the ``ranknet`` objective.
- ``reg:gamma``: gamma regression with log-link. Output is a mean of gamma distribution. It might be useful, e.g., for modeling insurance claims severity, or for any outcome that might be `gamma-distributed <https://en.wikipedia.org/wiki/Gamma_distribution#Occurrence_and_applications>`_.
- ``reg:tweedie``: Tweedie regression with log-link. It might be useful, e.g., for modeling total loss in insurance, or for any outcome that might be `Tweedie-distributed <https://en.wikipedia.org/wiki/Tweedie_distribution#Occurrence_and_applications>`_.

@@ -395,8 +395,9 @@ Specify the learning task and the corresponding learning objective. The objectiv

* ``eval_metric`` [default according to objective]

-- Evaluation metrics for validation data, a default metric will be assigned according to objective (rmse for regression, and logloss for classification, mean average precision for ranking)
-- User can add multiple evaluation metrics. Python users: remember to pass the metrics in as list of parameters pairs instead of map, so that latter ``eval_metric`` won't override previous one
+- Evaluation metrics for validation data. A default metric is assigned according to the objective (rmse for regression, logloss for classification, `mean average precision` for ``rank:map``, etc.)
+- Users can add multiple evaluation metrics. Python users: remember to pass the metrics in as a list of parameter pairs instead of a map, so that the latter ``eval_metric`` won't override the previous ones

- The choices are listed below:

- ``rmse``: `root mean square error <http://en.wikipedia.org/wiki/Root_mean_square_error>`_
@@ -480,6 +481,36 @@ Parameter for using AFT Survival Loss (``survival:aft``) and Negative Log Likeli

* ``aft_loss_distribution``: Probability Density Function, ``normal``, ``logistic``, or ``extreme``.

.. _ltr-param:

Parameters for learning to rank (``rank:ndcg``, ``rank:map``, ``rank:pairwise``)
================================================================================

These are parameters specific to the learning-to-rank task. See :doc:`Learning to Rank </tutorials/learning_to_rank>` for an in-depth explanation, and the usage sketch after this parameter list.

* ``lambdarank_pair_method`` [default = ``mean``]

How to construct pairs for pair-wise learning.

- ``mean``: Sample ``lambdarank_num_pair_per_sample`` pairs for each document in the query list.
- ``topk``: Focus on the top-``lambdarank_num_pair_per_sample`` documents. Construct :math:`|query|` pairs for each document ranked in the top ``lambdarank_num_pair_per_sample`` by the model.

* ``lambdarank_num_pair_per_sample`` [range = :math:`[1, \infty]`]

It specifies the number of pairs sampled for each document when the pair method is ``mean``, or the truncation level for queries when the pair method is ``topk``. For example, to train with ``ndcg@6``, set ``lambdarank_num_pair_per_sample`` to :math:`6` and ``lambdarank_pair_method`` to ``topk``.

* ``lambdarank_unbiased`` [default = ``false``]

Specify whether to debias the input click data.

* ``lambdarank_bias_norm`` [default = 2.0]

:math:`L_p` normalization for position debiasing, default is :math:`L_2`. Only relevant when ``lambdarank_unbiased`` is set to true.

* ``ndcg_exp_gain`` [default = ``true``]

Whether to use the exponential gain function for ``NDCG``. There are two forms of gain function for ``NDCG``: one uses the relevance value directly, while the other uses :math:`2^{rel} - 1` to emphasize retrieving relevant documents. When ``ndcg_exp_gain`` is true (the default), the relevance degree cannot be greater than 31.
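
A minimal end-to-end sketch using the parameters above via the Python API (synthetic data, purely illustrative):

    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 10))
    y = rng.integers(0, 4, size=1000)              # graded relevance 0..3
    qid = np.sort(rng.integers(0, 50, size=1000))  # rows must be grouped by qid

    Xy = xgb.DMatrix(X, y, qid=qid)
    booster = xgb.train(
        {
            "objective": "rank:ndcg",
            "lambdarank_pair_method": "topk",
            "lambdarank_num_pair_per_sample": 6,  # roughly targets ndcg@6
            "ndcg_exp_gain": True,
            "eval_metric": "ndcg@6",
        },
        Xy,
        num_boost_round=16,
    )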

***********************
Command Line Parameters
***********************
7 changes: 5 additions & 2 deletions python-package/xgboost/testing/__init__.py
@@ -431,8 +431,11 @@ def make_ltr(
"""Make a dataset for testing LTR."""
rng = np.random.default_rng(1994)
X = rng.normal(0, 1.0, size=n_samples * n_features).reshape(n_samples, n_features)
-y = rng.integers(0, max_rel, size=n_samples)
-qid = rng.integers(0, n_query_groups, size=n_samples)
+y = np.sum(X, axis=1)
+y -= y.min()
+y = np.round(y / y.max() * max_rel).astype(np.int32)
+
+qid = rng.integers(0, n_query_groups, size=n_samples, dtype=np.int32)
w = rng.normal(0, 1.0, size=n_query_groups)
w -= np.min(w)
w /= np.max(w)
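
The change replaces uniformly random labels with labels derived from the features, so a ranker fit on this synthetic data can actually learn the ordering. A condensed standalone mirror of the new construction:

    import numpy as np

    rng = np.random.default_rng(1994)
    n_samples, n_features, max_rel = 128, 4, 4

    X = rng.normal(0, 1.0, size=(n_samples, n_features))
    y = np.sum(X, axis=1)  # the label is now a deterministic function of X
    y -= y.min()           # shift so the smallest label is 0
    y = np.round(y / y.max() * max_rel).astype(np.int32)  # grade into 0..max_rel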
1 change: 0 additions & 1 deletion src/metric/rank_metric.cc
@@ -501,7 +501,6 @@ class EvalMAPScore : public EvalRankWithCache<ltr::MAPCache> {
auto rank_idx = p_cache->SortedIdx(ctx_, predt.ConstHostSpan());

common::ParallelFor(p_cache->Groups(), ctx_->Threads(), [&](auto g) {
-auto g_predt = h_predt.Slice(linalg::Range(gptr[g], gptr[g + 1]));
auto g_label = h_label.Slice(linalg::Range(gptr[g], gptr[g + 1]));
auto g_rank = rank_idx.subspan(gptr[g]);

193 changes: 193 additions & 0 deletions src/objective/lambdarank_obj.cc
@@ -69,6 +69,7 @@ void LambdaRankUpdatePositionBias(Context const* ctx, linalg::VectorView<double
lj(i) += g_lj(i);
}
}

// The ti+ is not guaranteed to decrease since it depends on the |\delta Z|
//
// The update normalizes the ti+ to make ti+(0) equal to 1, which breaks the probability
@@ -432,9 +433,201 @@
#endif // !defined(XGBOOST_USE_CUDA)
} // namespace cuda_impl

namespace cpu_impl {
void MAPStat(Context const* ctx, linalg::VectorView<float const> label,
common::Span<std::size_t const> rank_idx, std::shared_ptr<ltr::MAPCache> p_cache) {
auto h_n_rel = p_cache->NumRelevant(ctx);
auto gptr = p_cache->DataGroupPtr(ctx);

CHECK_EQ(h_n_rel.size(), gptr.back());
CHECK_EQ(h_n_rel.size(), label.Size());

auto h_acc = p_cache->Acc(ctx);

common::ParallelFor(p_cache->Groups(), ctx->Threads(), [&](auto g) {
auto cnt = gptr[g + 1] - gptr[g];
auto g_n_rel = h_n_rel.subspan(gptr[g], cnt);
auto g_rank = rank_idx.subspan(gptr[g], cnt);
auto g_label = label.Slice(linalg::Range(gptr[g], gptr[g + 1]));

// The number of relevant documents at each position
g_n_rel[0] = g_label(g_rank[0]);
for (std::size_t k = 1; k < g_rank.size(); ++k) {
g_n_rel[k] = g_n_rel[k - 1] + g_label(g_rank[k]);
}

// \sum l_k/k
auto g_acc = h_acc.subspan(gptr[g], cnt);
g_acc[0] = g_label(g_rank[0]) / 1.0;

for (std::size_t k = 1; k < g_rank.size(); ++k) {
g_acc[k] = g_acc[k - 1] + (g_label(g_rank[k]) / static_cast<double>(k + 1));
}
});
}
} // namespace cpu_impl
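
To make the cached statistics concrete, a rough NumPy equivalent of ``cpu_impl::MAPStat`` for a single query group (binary labels assumed; ``rank_idx`` sorts documents by descending prediction):

    import numpy as np

    label = np.array([1.0, 0.0, 1.0, 1.0, 0.0])  # hypothetical relevance labels
    rank_idx = np.array([3, 0, 4, 1, 2])         # model's ranking of the documents

    sorted_label = label[rank_idx]
    positions = np.arange(1, sorted_label.size + 1)

    n_rel = np.cumsum(sorted_label)            # relevant docs seen at each position
    acc = np.cumsum(sorted_label / positions)  # running sum of l_k / k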

class LambdaRankMAP : public LambdaRankObj<LambdaRankMAP, ltr::MAPCache> {
public:
void GetGradientImpl(std::int32_t iter, const HostDeviceVector<float>& predt,
const MetaInfo& info, HostDeviceVector<GradientPair>* out_gpair) {
CHECK(param_.ndcg_exp_gain) << "NDCG gain can not be set for the MAP objective.";
if (ctx_->IsCUDA()) {
return cuda_impl::LambdaRankGetGradientMAP(
ctx_, iter, predt, info, GetCache(), ti_plus_.View(ctx_->gpu_id),
tj_minus_.View(ctx_->gpu_id), li_full_.View(ctx_->gpu_id), lj_full_.View(ctx_->gpu_id),
out_gpair);
}

auto gptr = p_cache_->DataGroupPtr(ctx_).data();
bst_group_t n_groups = p_cache_->Groups();

out_gpair->Resize(info.num_row_);
auto h_gpair = out_gpair->HostSpan();
auto h_label = info.labels.HostView().Slice(linalg::All(), 0);
auto h_predt = predt.ConstHostSpan();
auto rank_idx = p_cache_->SortedIdx(ctx_, h_predt);
auto h_weight = common::MakeOptionalWeights(ctx_, info.weights_);

auto make_range = [&](bst_group_t g) { return linalg::Range(gptr[g], gptr[g + 1]); };

cpu_impl::MAPStat(ctx_, h_label, rank_idx, GetCache());
auto n_rel = GetCache()->NumRelevant(ctx_);
auto acc = GetCache()->Acc(ctx_);

auto delta_map = [&](auto y_high, auto y_low, std::size_t rank_high, std::size_t rank_low,
bst_group_t g) {
if (rank_high > rank_low) {
std::swap(rank_high, rank_low);
std::swap(y_high, y_low);
}
auto cnt = gptr[g + 1] - gptr[g];
// In a hot loop
auto g_n_rel = common::Span<double const>{n_rel.data() + gptr[g], cnt};
auto g_acc = common::Span<double const>{acc.data() + gptr[g], cnt};
auto d = DeltaMAP(y_high, y_low, rank_high, rank_low, g_n_rel, g_acc);
return d;
};
using D = decltype(delta_map);

common::ParallelFor(n_groups, ctx_->Threads(), [&](auto g) {
auto cnt = gptr[g + 1] - gptr[g];
auto w = h_weight[g];
auto g_predt = h_predt.subspan(gptr[g], cnt);
auto g_gpair = h_gpair.subspan(gptr[g], cnt);
auto g_label = h_label.Slice(make_range(g));
auto g_rank = rank_idx.subspan(gptr[g], cnt);

auto args = std::make_tuple(this, iter, g_predt, g_label, w, g_rank, g, delta_map, g_gpair);

if (param_.lambdarank_unbiased) {
std::apply(&LambdaRankMAP::CalcLambdaForGroup<true, D>, args);
} else {
std::apply(&LambdaRankMAP::CalcLambdaForGroup<false, D>, args);
}
});
}
static char const* Name() { return "rank:map"; }
[[nodiscard]] const char* DefaultEvalMetric() const override {
return this->RankEvalMetric("map");
}
};
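
The point of caching ``n_rel`` and ``acc`` is that ``delta_map`` becomes O(1) per pair. The PR's actual ``DeltaMAP`` is defined elsewhere; the following is a hedged reconstruction of the textbook AP change for binary labels from the cached prefix sums (1-based ranks, ``r_high < r_low``):

    def delta_ap(y_high, y_low, r_high, r_low, n_rel, acc):
        """Sketch of the AP change from swapping the documents at two ranks.

        n_rel[k-1] is the relevant count in the top k; acc[k-1] is the sum of
        l_m / m for m <= k (both from MAPStat).  Not the PR's exact DeltaMAP.
        """
        if y_high == y_low:
            return 0.0
        # The label at r_high flips, and so does precision@r_high.
        d = (y_low - y_high) * n_rel[r_high - 1] / r_high
        d += y_low * (y_low - y_high) / r_high
        # Each position strictly between the ranks gains (y_low - y_high) / k.
        d += (y_low - y_high) * (acc[r_low - 2] - acc[r_high - 1])
        # At r_low only the label flips; precision@r_low itself is unchanged.
        d += (y_high - y_low) * n_rel[r_low - 1] / r_low
        # LambdaMART-style objectives weight each pair by the absolute change.
        return abs(d) / n_rel[-1]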

#if !defined(XGBOOST_USE_CUDA)
namespace cuda_impl {
void MAPStat(Context const*, MetaInfo const&, common::Span<std::size_t const>,
std::shared_ptr<ltr::MAPCache>) {
common::AssertGPUSupport();
}

void LambdaRankGetGradientMAP(Context const*, std::int32_t, HostDeviceVector<float> const&,
const MetaInfo&, std::shared_ptr<ltr::MAPCache>,
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double>, linalg::VectorView<double>,
HostDeviceVector<GradientPair>*) {
common::AssertGPUSupport();
}
} // namespace cuda_impl
#endif // !defined(XGBOOST_USE_CUDA)

/**
* \brief The RankNet loss.
*/
class LambdaRankPairwise : public LambdaRankObj<LambdaRankPairwise, ltr::RankingCache> {
public:
void GetGradientImpl(std::int32_t iter, const HostDeviceVector<float>& predt,
const MetaInfo& info, HostDeviceVector<GradientPair>* out_gpair) {
CHECK(param_.ndcg_exp_gain) << "NDCG gain can not be set for the pairwise objective.";
if (ctx_->IsCUDA()) {
return cuda_impl::LambdaRankGetGradientPairwise(
ctx_, iter, predt, info, GetCache(), ti_plus_.View(ctx_->gpu_id),
tj_minus_.View(ctx_->gpu_id), li_full_.View(ctx_->gpu_id), lj_full_.View(ctx_->gpu_id),
out_gpair);
}

auto gptr = p_cache_->DataGroupPtr(ctx_);
bst_group_t n_groups = p_cache_->Groups();

out_gpair->Resize(info.num_row_);
auto h_gpair = out_gpair->HostSpan();
auto h_label = info.labels.HostView().Slice(linalg::All(), 0);
auto h_predt = predt.ConstHostSpan();
auto h_weight = common::MakeOptionalWeights(ctx_, info.weights_);

auto make_range = [&](bst_group_t g) { return linalg::Range(gptr[g], gptr[g + 1]); };
auto rank_idx = p_cache_->SortedIdx(ctx_, h_predt);

auto delta = [](auto...) { return 1.0; };
using D = decltype(delta);

common::ParallelFor(n_groups, ctx_->Threads(), [&](auto g) {
auto cnt = gptr[g + 1] - gptr[g];
auto w = h_weight[g];
auto g_predt = h_predt.subspan(gptr[g], cnt);
auto g_gpair = h_gpair.subspan(gptr[g], cnt);
auto g_label = h_label.Slice(make_range(g));
auto g_rank = rank_idx.subspan(gptr[g], cnt);

auto args = std::make_tuple(this, iter, g_predt, g_label, w, g_rank, g, delta, g_gpair);
if (param_.lambdarank_unbiased) {
std::apply(&LambdaRankPairwise::CalcLambdaForGroup<true, D>, args);
} else {
std::apply(&LambdaRankPairwise::CalcLambdaForGroup<false, D>, args);
}
});
}

static char const* Name() { return "rank:pairwise"; }
[[nodiscard]] const char* DefaultEvalMetric() const override {
return this->RankEvalMetric("ndcg");
}
};
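
Because ``delta`` is constantly ``1.0`` here, ``rank:pairwise`` reduces to plain RankNet: every pair gets equal weight instead of an NDCG/MAP delta. A small illustration of the per-pair RankNet gradient this corresponds to (the real computation lives in ``CalcLambdaForGroup``, optionally with unbiased weighting):

    import math

    def ranknet_pair(s_high, s_low):
        """Gradient/Hessian for one (more relevant, less relevant) pair."""
        p = 1.0 / (1.0 + math.exp(s_high - s_low))  # P(pair is mis-ordered)
        grad = -p                    # applied to s_high; negated for s_low
        hess = max(p * (1.0 - p), 1e-16)
        return grad, hess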

#if !defined(XGBOOST_USE_CUDA)
namespace cuda_impl {
void LambdaRankGetGradientPairwise(Context const*, std::int32_t, HostDeviceVector<float> const&,
const MetaInfo&, std::shared_ptr<ltr::RankingCache>,
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double const>, // input bias ratio
linalg::VectorView<double>, linalg::VectorView<double>,
HostDeviceVector<GradientPair>*) {
common::AssertGPUSupport();
}
} // namespace cuda_impl
#endif // !defined(XGBOOST_USE_CUDA)

XGBOOST_REGISTER_OBJECTIVE(LambdaRankNDCG, LambdaRankNDCG::Name())
.describe("LambdaRank with NDCG loss as objective")
.set_body([]() { return new LambdaRankNDCG{}; });

XGBOOST_REGISTER_OBJECTIVE(LambdaRankPairwise, LambdaRankPairwise::Name())
.describe("LambdaRank with RankNet loss as objective")
.set_body([]() { return new LambdaRankPairwise{}; });

XGBOOST_REGISTER_OBJECTIVE(LambdaRankMAP, LambdaRankMAP::Name())
.describe("LambdaRank with MAP loss as objective.")
.set_body([]() { return new LambdaRankMAP{}; });

DMLC_REGISTRY_FILE_TAG(lambdarank_obj);
} // namespace xgboost::obj