[Roadmap] Multiple outputs. #9043
Comments
Hi, great work on the initial multitarget implementation! Given the roadmap, when can we expect GPU support for multi-output regression? When this support is added will
Hi @CarloLepelaars,
Hi, very nice work! I am wondering how SHAP should be used for multi-output models, e.g. how to explain links between the Ys, and how to interpret the effects of Xs - e.g., which Xs display common effects across the Ys, and which Xs display differential effects. Do you know a good example of using SHAP for a multi-output model? |
For one model per target, it's the same as single target. As for vector leaf, I haven't looked into it yet, but no significant difference comes to mind.
I am currently toying with the multi-target approach and am having a hard time defining a custom metric (I haven't tried a custom loss). Preds seems to be of size (len(y) x len(targets)) while y_true is of shape (len(y), len(targets)). I have managed to handle this inside my metric so that it returns one value, but now I get an error about an output being a tuple instead of a number.
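A minimal sketch of a metric that copes with the shape mismatch described above and returns a single number. The function name `multi_target_rmse` is hypothetical, and the sketch assumes the scikit-learn interface, where (as far as I know) a callable metric receives `(y_true, y_pred)` and should return one float; the native `xgb.train` interface instead expects a `(name, value)` pair from `custom_metric`, so mixing the two conventions could plausibly produce the tuple error.

```python
import numpy as np


def multi_target_rmse(y_true, y_pred):
    """Return one float: the RMSE of each target, averaged over targets."""
    y_true = np.asarray(y_true, dtype=float)
    # Predictions may arrive flattened; restore the (n_samples, n_targets) shape.
    y_pred = np.asarray(y_pred, dtype=float).reshape(y_true.shape)
    per_target = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    # Return a plain Python float, not a tuple or an array.
    return float(per_target.mean())


# Toy check with two samples and two targets, predictions given flattened.
y = np.array([[1.0, 2.0], [3.0, 4.0]])
flat_preds = np.array([1.0, 2.0, 3.0, 6.0])
score = multi_target_rmse(y, flat_preds)
```

With the scikit-learn wrapper this function would be passed as the metric callable directly; for the native interface it would need a thin wrapper that returns a name together with the value.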
Hi. Has anybody trained the multiple outputs XGBoost model on a Mac arm64 machine? On the recent stable version I got an error: On the latest nightly version xgboost-2.1.0.dev0+a7226c02223246be78a59c3a4e8c32d1c68c1ff9, I managed to load the CPU, but there was no feedback in the terminal window.
Is the vector-leaf-based multi-output model still a work in progress? Also, which research paper is the splitting mechanism for the decision trees based on? @trivialfis
Yes, it's still a work in progress.
Hi @trivialfis, I'm currently working on some models using XGBoostLSS, which as far as I understand is based on the multi-output feature of XGBoost. I wonder how monotonic constraints are handled in the multi-output case. It seems constraints are shared among the trees built for each target; could you confirm? Thanks for your work on this feature!
Hello @trivialfis, I'm working on a multi label binary classification problem (I have three targets) and all my targets are highly imbalanced. But I don't seem to understand how I can leverage Also, will I be able to access functionalities such as shapley values computation? |
Hi @trivialfis, I have been using the multi-output-tree feature on a couple of real-world datasets across regression and classification tasks, and also with custom loss functions; the results are great. An observation: The JSON model file of the multi-output-tree model doesn't include Thanks for the great work on this! Looking forward to the successful implementation of the roadmap.
It's still a work in progress. We still need to add these fields.
This happens only if the output targets are correlated, like softmax. For multi-target regression, we assume targets are independent in the gradient calculation. The potential dependencies are captured in the output tree instead of the gradient calculation. |
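To illustrate the distinction in the reply above, here is a rough NumPy sketch, not the library's implementation. The helper names `independent_squared_error` and `softmax_grad` are hypothetical. For squared error each target's gradient depends only on its own prediction, whereas for softmax the gradient of one output depends on all logits.

```python
import numpy as np


def independent_squared_error(y_true, y_pred):
    """Per-target gradient and Hessian for squared error, with no cross-target terms."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float).reshape(y_true.shape)
    grad = y_pred - y_true        # column k depends only on prediction k
    hess = np.ones_like(grad)     # diagonal Hessian: targets treated as independent
    return grad, hess


def softmax_grad(logits, labels):
    """Gradient of cross-entropy w.r.t. logits; outputs are coupled via normalisation."""
    z = logits - logits.max(axis=1, keepdims=True)   # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    onehot = np.eye(logits.shape[1])[labels]
    return p - onehot


y = np.array([[1.0, 2.0]])
g_a, _ = independent_squared_error(y, np.array([[0.0, 0.0]]))
g_b, _ = independent_squared_error(y, np.array([[5.0, 0.0]]))  # change target 0 only

s_a = softmax_grad(np.array([[0.0, 0.0]]), np.array([0]))
s_b = softmax_grad(np.array([[5.0, 0.0]]), np.array([0]))      # change logit 0 only
```

Changing only the first prediction leaves the second column of the squared-error gradient untouched, while the same change shifts the second column of the softmax gradient.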
Hi @trivialfis Check failed: gpair->Shape(1) == 1 (10 vs. 1): support for multi-target tree is not yet implemented. (Which is likely related to this line) Why can't we perform the same procedure as single-output trees? In my specific scenario, I aim to adjust the leaf values using the updated gradient statistics from a different dataset without changing thresholds. Due to the small sample size, each iteration requires me to update the previous trees for each new batch of samples based on their gradients. |
I'm currently trying to work on some other priorities, will get back to this. |
Since XGBoost 1.6, we have been working on multi-output support for the tree model. In 2.0, we will have the initial implementation of the vector-leaf-based multi-output model. This issue is a tracker for future development and related discussion. The original feature request is here: #2087. The related features listed here are for vector leaf rather than general multi-output.
Feel free to share your suggestions or make related feature requests in the comments.
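For readers who want to try the initial implementation, here is a minimal sketch of a parameter set. It assumes the `multi_strategy` training parameter as documented for XGBoost 2.0 together with the `hist` tree method; the commented usage lines and the array names `X` and `Y` are illustrative only, so please check the documentation of your installed version.

```python
# Assumed XGBoost 2.0 parameters for vector-leaf trees; verify against your version.
params = {
    "tree_method": "hist",
    # "multi_output_tree" builds one tree with vector leaves;
    # "one_output_per_tree" builds a separate tree per target.
    "multi_strategy": "multi_output_tree",
    "objective": "reg:squarederror",
}

# Hypothetical usage, with X of shape (n_samples, n_features)
# and Y of shape (n_samples, n_targets):
# import xgboost as xgb
# booster = xgb.train(params, xgb.DMatrix(X, label=Y), num_boost_round=10)
```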
Implementation Optimization
Algorithmic Optimization
We are still looking for potential algorithmic optimizations for vector leaf; this is the pool of candidates. We need to survey all available options. Feel free to share ideas or paper recommendations.
GPU Implementation
Documentation
Multi-task
Features
Learning to rank
A ranking model could consider multiple criteria. This might require multi-task to be supported.
Quantile regression
Distributed
hist (#9171)
Binding
HPO
Other extensions
Applications
Benchmarks