Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add tensorboard logger test and fix callback typo #560

Merged
merged 9 commits into from
Dec 3, 2023
Merged

Conversation

bw4sz
Copy link
Collaborator

@bw4sz bw4sz commented Dec 1, 2023

Closes the #559 and #558 , there was a typo in callbacks only for the create_trainer(logger=), function. I added a test, but it really should have been caught somewhere else. This is high priority.

@bw4sz bw4sz self-assigned this Dec 1, 2023
@ethanwhite
Copy link
Member

Everything looks good. Will approve once the tests pass.

@bw4sz
Copy link
Collaborator Author

bw4sz commented Dec 1, 2023

I think there is still something hapenning with the tensorboard logger and the trainer object, I have a meeting until 1pm, but i'll have look this afternoon.

@ethanwhite
Copy link
Member

Sounds good. I'll leave this for now. Ping me when you're ready for a review.

@bw4sz
Copy link
Collaborator Author

bw4sz commented Dec 2, 2023

This turned out to be a large issue and definitely warranting of 1.3.0. Our tests did not cover the logger, we had discontinued it because of the annoyances of mocking the comet_ml experiment. This turns out to be tangential to the goal, but hid the actual problem, which was quite huge. The problem is upstream of any given logger, and with pytorch lightning trainer.predict methods.

In f71eef4, I changed the predict_file methods to user trainer.predict. The virtue of this was that it made prediction much faster. We used to have to do alot of annoying workarounds, like not run evaluation that often because that function was so slow. What we didn't see was that when using a logger, like comet_ml or tensorboard, you cannot mix trainer.predict inside of an evaluation_hook. I have documented this on pytorch lightning repo. See lots of relevant conversation.

Lightning-AI/pytorch-lightning#19101

The bad news means that training with a logger has been dead for three weeks and we didn't know.
The good news is that with every error, we find ways to make the whole API better. I have refactored the evaluate routine to include a new wrapper that directly uses the predictions made during pytorch lightnings hooks, instead of re-running them separately. This will be faster. It also allowed me to move code out of main.py and into other modules, which is a long term goal to making the API more flexible, main.py should really just have wrappers and not custom code. During this I was then able to remove quite a bit of duplicate code. This is the same reason that the typo noted in #558 was able to get through the tests. No test was running this because it related to a logger during a callback. This is corrected now.

@bw4sz
Copy link
Collaborator Author

bw4sz commented Dec 2, 2023

i'm tempted to force push this, but I suppose it can wait until Monday. Once @ethanwhite merges I will bumpversion minor and push to pypi.

Copy link
Member

@ethanwhite ethanwhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good. My one meaningful thing is that I found the fact that predict.predict_file() doesn't work directly on a file confusing. It sent me on a couple of tangents while reviewing which might suggest it could lead to confusing maintenance. Other than potentially renaming that there are a few docstring issues to cleanup and I think this should be ready to go it.

HISTORY.rst Outdated Show resolved Hide resolved
deepforest/predict.py Outdated Show resolved Hide resolved
deepforest/predict.py Show resolved Hide resolved
deepforest/predict.py Outdated Show resolved Hide resolved
deepforest/main.py Outdated Show resolved Hide resolved
deepforest/predict.py Show resolved Hide resolved
bw4sz added 2 commits December 2, 2023 15:07
…ey are private methods. I improved the seperation of predict._predict_image_ and main.predict_image to follow the logic of other methods
@bw4sz
Copy link
Collaborator Author

bw4sz commented Dec 2, 2023

@ethanwhite I pushed the docstring changes, renamed the predict.predict_file to predict._predict_a_dataloader_ method to reflect it predicts a dataloader and separated predict._predict_image_ from main.predict_image.

Copy link
Member

@ethanwhite ethanwhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more small cleanups and then I'll get this merged.

deepforest/predict.py Outdated Show resolved Hide resolved
deepforest/main.py Outdated Show resolved Hide resolved
deepforest/predict.py Outdated Show resolved Hide resolved
deepforest/predict.py Outdated Show resolved Hide resolved
Copy link
Member

@ethanwhite ethanwhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense. I like the solution.

@ethanwhite ethanwhite merged commit 8e7efa9 into main Dec 3, 2023
6 checks passed
@ethanwhite ethanwhite deleted the issue_559 branch December 3, 2023 16:32
janjatovic pushed a commit to Treeconomy/DeepForest_new that referenced this pull request Mar 26, 2024
add tensorboard logger test and fix callback typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants