Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix missing_label_indices label matching & posterior lengths #342

Closed

Conversation

emilyachang
Copy link
Contributor

Reference issue

Follow-up to fix in #341

Type of change

Bugfix

What does this implement/fix?

Made a few more changes to...

  1. Fix the matching of self.classes for finding the missing_class_labels (changing y to the indices at the beginning results in it matching the classes to the indices and not the label)
  2. Make sure the leaf posteriors and uniform posteriors are the same length

Additional information

My bad for not catching this before submitting the first PR!

@emilyachang emilyachang changed the title Missing label indices fix Fix missing_label_indices label matching & posterior lengths Oct 24, 2020
Copy link
Collaborator

@levinwil levinwil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for? I thought everything was fine before. You should NOT change the length of that uniform posterior - if will cause bugs later down the line. Please do not change the length of the uniform posterior from what it was before.

All your other changes seemingly have no effect on anything and only makes things more complicated.

What were the motivations behind these changes?

@emilyachang
Copy link
Contributor Author

Regarding the change to y, with the previous change, "if label not in np.unique(y)" would be comparing the index to the label in classes (i.e. for one missing label, say 94, it would be comparing it to the index 4 and therefore appending all the values into missing_label_indices_), so if there was one missing label, missing_label_indices_ would have length 19 instead of 10.

When running the recruitment experiment, the length of the leaf_to_posterior_ and the length of the uniform_posterior_ were different for some of the x in X for voters.predict_proba, causing votes_per_example to not have columns of the same length and therefore when adding in missing_label_indices_, there was an error inserting along axis=1.

Let me know if there's a smarter way to fix there issues, but this was what I did to get my updated experiment file running.

@levinwil
Copy link
Collaborator

I agree with your first paragraph.

With respect to your second paragraph, these were fixed in 'spiral' branch that has yet to be approved (since it did not pass one of the tests). Could you please apply the changes from your first paragraph to the voters.py file on the spiral branch and then commit?

@levinwil
Copy link
Collaborator

Also, could you please apply these changes to the KNN voter as well?

@levinwil
Copy link
Collaborator

levinwil commented Oct 28, 2020

Please see the above comment. You should be simply applying a simple change to https://github.com/neurodata/ProgLearn/blob/spiral/proglearn/voters.py#L68 - just update lines 68/70 and 239/241. Those should be the only updated lines, update to change the loops to be over an enumerate, and appending idx instead of label. Do not change the length of the uniform posterior from the current value in this file.

Those commits will affect the current open PR for that branch, so this PR can and should be closed.

Copy link
Collaborator

@levinwil levinwil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see above comment for requested changes.

@emilyachang
Copy link
Contributor Author

I thought I reverted the change to the uniform posterior with the latest commit?

Should I not be fixing the return for the KNN voter .predict as I did in the previous PR for this issue?

@emilyachang
Copy link
Contributor Author

Should I just create a new branch off of spiral and apply the changes instead?

@levinwil
Copy link
Collaborator

You did revert the change to uniform posterior, but that change was unnecessary since it was correct before - uniform posterior should be of length num_fit_classes.

Your previous change to the KNNClassificationVoter was correct. Please do change the return statement.

No, please apply these changes directly to the spiral branch and close this PR. Just pull from spiral and then commit directly to it.

@levinwil
Copy link
Collaborator

@emilyachang We just merged the spiral branch into staging. Please make the above mentioned changes to the staging branch.

@emilyachang emilyachang deleted the missing_label_indices-fix branch December 16, 2020 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants