Add random forest in VFL #523

Merged
merged 6 commits into from
Feb 16, 2023

@xieyxclack (Collaborator) commented Feb 14, 2023

This PR is adapted from #501. Many thanks to @qbc2016 for the contributions!

The modifications in this PR include:

  • Random forest for VFL
    • A new trainer: random_forest_trainer
    • A new model: random forest (multiple decision trees)
    • VerticalDataSampler: supports sampling a batch of data and a subset of features for training
    • Feature protection methods for random forest
  • vertical.data_size_for_debug: use only a subset of the data when running VFL for debugging
  • Modified losses in VFL: support both mean and sum operations when calculating loss/acc
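
As a rough illustration of "sampling a batch of data and a subset of features", here is a minimal sketch. It is not the PR's VerticalDataSampler: the class name `ToyVerticalSampler`, its methods, and its parameters are all hypothetical, and it assumes rows are drawn with replacement (bootstrap style) while features are drawn without replacement.

```python
import numpy as np


class ToyVerticalSampler:
    """Hypothetical stand-in, NOT the PR's VerticalDataSampler."""

    def __init__(self, num_samples, num_features, seed=0):
        self.num_samples = num_samples
        self.num_features = num_features
        self.rng = np.random.default_rng(seed)

    def sample_rows(self, batch_size, replace=True):
        # Draw row indices; with replace=True this is a bootstrap sample.
        return self.rng.choice(self.num_samples, size=batch_size,
                               replace=replace)

    def sample_features(self, frac):
        # Draw a subset of feature indices without replacement.
        k = max(1, int(round(frac * self.num_features)))
        chosen = self.rng.choice(self.num_features, size=k, replace=False)
        return np.sort(chosen)


# Toy data: 10 samples, 4 features held by one party.
X = np.arange(40, dtype=float).reshape(10, 4)
sampler = ToyVerticalSampler(num_samples=10, num_features=4, seed=0)
rows = sampler.sample_rows(batch_size=6)
cols = sampler.sample_features(frac=0.5)
X_sub = X[np.ix_(rows, cols)]  # the batch one tree would be trained on
```

In a vertical setting the row indices would presumably have to be shared across parties so that every party trains on the same samples, while each party draws feature indices only from the columns it owns. How the PR coordinates this is not shown in this thread.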

How to apply random forest:

  • Set vertical.algo = 'rf' and model.type = 'random_forest'. @rayrayraykk
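
The two settings above, written out as a nested config. This is only a sketch: a plain dict stands in for the project's real config object, and `apply_overrides` is a hypothetical helper, not part of the project's API. Other options that a real run may need are not shown.

```python
def apply_overrides(cfg, overrides):
    """Set dotted keys such as 'vertical.algo' inside a nested dict."""
    for dotted_key, value in overrides.items():
        node = cfg
        *parents, leaf = dotted_key.split(".")
        for name in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(name, {})
        node[leaf] = value
    return cfg


# The two options named in this PR description.
cfg = apply_overrides({}, {
    "vertical.algo": "rf",
    "model.type": "random_forest",
})
```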

Todo:

  • Docs

return np.sum(left_child_indicator) / total_num * left_gini + sum(
right_child_indicator) / total_num * right_gini

def cal_sse(self, split_idx, y, indicator):
Collaborator

Actually, "sse" here means "Sum of Square Error of two subtrees". Should we rename it?

Collaborator Author

sse -> sum_of_square_mean_err
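
For readers following the naming discussion, here is a minimal sketch of the two split criteria involved: the size-weighted Gini impurity from the quoted return statement, and the sum of squared errors around each subtree's mean. The function names and signatures are illustrative assumptions and do not match the PR's methods; `left_mask` is assumed to be a boolean array marking samples sent to the left child.

```python
import numpy as np


def gini(y):
    # Gini impurity of one node; an empty node counts as pure.
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def weighted_gini(y, left_mask):
    # Impurity of a split: each child's Gini weighted by its share of samples.
    left_mask = np.asarray(left_mask, dtype=bool)
    n = len(y)
    n_left = left_mask.sum()
    return (n_left / n * gini(y[left_mask])
            + (n - n_left) / n * gini(y[~left_mask]))


def split_sse(y, left_mask):
    # Sum over both children of squared deviations from that child's mean.
    left_mask = np.asarray(left_mask, dtype=bool)
    total = 0.0
    for part in (y[left_mask], y[~left_mask]):
        if len(part):
            total += np.sum((part - part.mean()) ** 2)
    return total


y_cls = np.array([0, 0, 1, 1])
y_reg = np.array([1.0, 3.0, 10.0, 14.0])
mask = np.array([True, True, False, False])

pure_split = weighted_gini(y_cls, mask)   # both children are pure
reg_split = split_sse(y_reg, mask)        # left: 1+1, right: 4+4
```

A lower value is better for both criteria, so a tree builder would pick the split that minimizes them.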

Collaborator

@qbc2016 qbc2016 left a comment

Nice work!

@xieyxclack xieyxclack merged commit 320a225 into alibaba:master Feb 16, 2023
@xieyxclack xieyxclack mentioned this pull request Feb 17, 2023
@xieyxclack xieyxclack deleted the dev/rf branch April 3, 2023 14:38
Labels
Feature New feature Tree
2 participants