Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmark with full data: Figure 1 notebook #328

Merged
merged 12 commits into from
Oct 21, 2020

Conversation

EYezerets
Copy link
Collaborator

@EYezerets EYezerets commented Oct 19, 2020

Reference issue

Addresses issue 321
: Run UF figure 1 tutorial with real parameters
and makes progress toward issue 62: re-write all the UF notebooks called ProgL repo functions.

Type of change

Documentation, Bug fix

What does this implement/fix?

This was run on AWS with the full data, and the figure is updated. Now, the tutorial for Figure 1 created by @bstraus1 shows the figure run with the test parameters, while this notebook in benchmarks/uf_posterior_visualization shows Figure 1 run using UncertaintyForest() with the real parameters. This also fixes a bug in forest.py (self.max_depth is now used), and allows UncertaintyForest() to take tree_construction_proportion as an argument (set to 0.4 in the original uncertainty forest paper)

Additional information

@levinwil
Copy link
Collaborator

  1. Please don't save fig1.pdf - it's already in the notebook itself
  2. You don't need the functions folder - these functions are already defined in docs/tutorials/functions so please just import from there so we don't have duplicate code in the same repo and remove ProgLearn/benchmarks/uf_posterior_visualization/functions/ entirely. You don't need another functions folder defining the same functions.

Copy link
Collaborator

@levinwil levinwil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EYezerets please see above comments

@bstraus1
Copy link
Contributor

Will you also reference #62 in this issue description? It is the umbrella issue which included #321.

@levinwil levinwil linked an issue Oct 19, 2020 that may be closed by this pull request
@levinwil
Copy link
Collaborator

Also, the hyperparams defining UF in this notebook do not match the original uf hyperparams.

This is the original notebook used to produce figure 1 in the UF paper: https://github.com/neurodata/uncertainty-forest/blob/master/figs/fig1/figure-1.ipynb

Please note that the fraction of the data used to generate tree structure was .4. In ProgLearn, we use .67.

Please add a parameter to the UncertaintyForest initializer (https://github.com/neurodata/ProgLearn/blob/main/proglearn/forest.py#L218) and call it "tree_construction_proportion." Please add that parameter to the initialization docstring for UF. Then feed that parameter into the LifelongClassificationForest as the default_tree_construction_proportion on this line (https://github.com/neurodata/ProgLearn/blob/main/proglearn/forest.py#L237). Then PR that feature addition into staging.

Then rerun the notebook and set tree_construction_proportion = .4 in the notebook when initializing UF. After you make the updates listed a few comments above, and have done the instructions described here, make a PR (separate from the feature addition PR) for the rerun notebook into staging.

@EYezerets
Copy link
Collaborator Author

@levinwil Thanks for all the guidance! If we're changing tree_construction_proportion, do we want to go ahead and also set the finite sample correction to kappa=3?
I believe that Fig1 saves as a pdf from the plotting function that's in the functions folder that @bstraus1 's tutorial also calls (please correct me if I'm wrong). Do you want me to just delete the pdf before I commit, or change the function's code so it doesn't output a pdf?
Looks like these are some significant changes. Are they relevant to @bstraus1 's tutorial notebook as well, or do we not care since it just needs to run?

@levinwil
Copy link
Collaborator

  1. We don’t currently have the capability to set kappa.
  2. Please both delete the PDF and change the function so that it does NOT save further PDF’s
  3. Don’t make these changes to the Ben’s tutorial notebook

Copy link
Collaborator

@levinwil levinwil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EYezerets The Travis checks did not pass. Please black format your files. There are instructions on how to do so in the contribution guidelines.

@EYezerets
Copy link
Collaborator Author

Hi @levinwil, sorry I didn't see your comment about black formatting. I have not used this, since I was working with the uncertainty-forest repository before. Does it mean I need to pip install these two things in my virtual environment and then rerun the code? (I found this: https://travis-ci.org/github/neurodata/ProgLearn/builds/737697602/config and this: https://bitsandbrains.io/2020/10/05/pr-checklist)

pip install -U pytest pytest-cov codecov
pip install black

@levinwil
Copy link
Collaborator

levinwil commented Oct 21, 2020

There are instructions on how to black format in the contribution guidelines: http://proglearn.neurodata.io/contributing.html

You simply run:
pip install black
black <file/folder name>

In your case, you should insert the folder ‘proglearn’, so the second command will be ‘black proglearn’ (the source code directory, NOT the overall repository)

Please note that you should ONLY run black formatting on source code in the proglearn directory. You should NOT black format any Jupyter notebooks.

@EYezerets
Copy link
Collaborator Author

Ah! I see OK thank you! That makes more sense. Will do

@levinwil levinwil merged commit f844065 into neurodata:staging Oct 21, 2020
EYezerets added a commit to EYezerets/ProgLearn that referenced this pull request Oct 21, 2020
Merge pull request neurodata#328 from EYezerets/staging
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

re-write all the UF notebooks called ProgL repo functions
3 participants