Update GUI to allow for annotating keypoints that are not visible #27
Currently there is no way to indicate that a body part is not visible when making annotations. We're still testing this feature internally (it ended up being a bit more complicated than expected), but it should be available soon, hopefully before the end of November. Leaving them with the default coordinate will affect training, so I would recommend just annotating as best as you can near where the body part should be (such as the closest visible body part). |
Has there been any update to this? Can you describe what challenges there are in implementing this? I'm not particularly experienced, but this is critical for me and I'd be glad to help if it's not overly complex. |
We had a student working on this as part of a larger update, who unfortunately left before they finished it. This ended up being more complicated than I originally thought, and I haven't had time to finish it. I can take a look at the code next week to see how much progress they made and try to adapt it if useful. Basically, the GUI class and Annotator class (which subclasses the GUI class) need to be updated with a new hotkey and logic that detects the new hotkey when pressed and sets the coordinates to NaN. Then it needs logic that detects NaNs, changes the text color (to grey or something similar), and doesn't draw the points/crosshairs or lines for the skeleton in the GUI, to visually indicate that the point is not visible. |
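The hotkey logic described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the DeepPoseKit GUI code: `toggle_visibility`, `drawable_points`, the list-of-pairs layout, and the `restore_to` default are all hypothetical names and choices made for the example.

```python
import math

NAN_POINT = (math.nan, math.nan)


def is_visible(point):
    """A keypoint counts as visible when neither coordinate is NaN."""
    return not (math.isnan(point[0]) or math.isnan(point[1]))


def toggle_visibility(keypoints, index, restore_to=(1.0, 1.0)):
    """Flip one keypoint between visible and not visible (NaN), in place.

    restore_to is a hypothetical default position used when a hidden
    point is made visible again.
    """
    if is_visible(keypoints[index]):
        keypoints[index] = NAN_POINT
    else:
        keypoints[index] = restore_to
    return keypoints


def drawable_points(keypoints):
    """Indices of keypoints the GUI should draw (markers, skeleton lines)."""
    return [i for i, p in enumerate(keypoints) if is_visible(p)]


keypoints = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
toggle_visibility(keypoints, 1)        # e.g. the hotkey is pressed on keypoint 1
visible = drawable_points(keypoints)   # keypoint 1 is skipped when drawing
```

A drawing routine would then loop over `visible` only, and could render the labels of the remaining keypoints in grey.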
Ok thanks please let me know! |
I implemented the changes in the pip install --upgrade git+https://www.github.com/jgraving/deepposekit.git The hotkey is V to toggle visibility within the GUI. The Euclidean error and confidence scores during training may be unreliable, as there is no confidence score threshold implemented for these metrics. I'd recommend turning off the skeleton/graph confidence maps during training |
Thanks for this! I verified that it works in the annotator (but not downstream training). I'm not sure I totally understand the concern around training. As I understand it, 'non visible' points now have their location set to NaN. I could imagine two ways of handling this at training time: (1) the network is trained to predict 0 confidence values across the entire image, or (2) the loss for non-visible points is masked out, so that it makes no contribution to training. I agree that if implemented as (1), I'm not sure what will happen during training. I had assumed that (2) was the idea here, to simply ignore elements of the loss (keypoint and associated graph components) related to the non-visible point. If implemented as (2), I don't understand why this would cause instability during training? As an aside, something that is an issue, at least on Windows computers, is that installing DPK with pip causes the installed OpenCV package to be changed to opencv-python-headless. I think that's the wrong version: it causes lots of errors, and it has to be manually uninstalled and opencv-python manually installed afterwards. I know this issue has come up with the imgaug package as well: |
Good to hear! Currently scenario (1) is used at training time as the network has to be optimized to predict zeros when the keypoint isn't visible so the low confidence score can be used to filter non-visible points during inference. However, the graph/skeleton drawing function hasn't been updated to deal with the NaNs, which is why I'd recommend disabling this in the There shouldn't be any instability during training with the loss function or optimizer. My comment was pointing out that if non-visible keypoints are predicted as all zeros then the predicted keypoint coordinates will basically be a random coordinate in the image (with low confidence) and the euclidean error metric shown during training may not be a reliable indicator of performance. This metric needs to be updated to only include euclidean error for keypoints above a confidence threshold (e.g. p >= 0.1). My other comment was if specific types of keypoints are only rarely visible within the training set, then the network might just learn to always predict values close to zero, which is why online hard keypoints mining loss is useful, as it detects average error for each keypoint within a training batch and dynamically reweights the top-k highest error keypoints. I'll take a look later this week to see if I can implement the changes I described. I'll also look into your comment about the dependencies. |
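The thresholded metric suggested above (only counting keypoints with p >= 0.1) might look something like the sketch below. It is an illustration under assumptions, not the library's metric: the function name `mean_error_above_threshold` and the list-based data layout are hypothetical.

```python
import math


def mean_error_above_threshold(y_true, y_pred, confidence, threshold=0.1):
    """Mean Euclidean error over keypoints with confidence >= threshold.

    Returns NaN when no keypoint passes the threshold.
    """
    errors = [
        math.dist(t, p)
        for t, p, c in zip(y_true, y_pred, confidence)
        if c >= threshold
    ]
    return sum(errors) / len(errors) if errors else math.nan


# The third keypoint is not visible: it has no true location, and the
# prediction is an arbitrary low-confidence coordinate.
y_true = [(10.0, 10.0), (20.0, 20.0), (math.nan, math.nan)]
y_pred = [(13.0, 14.0), (20.0, 20.0), (57.0, 3.0)]
confidence = [0.9, 0.8, 0.02]

filtered = mean_error_above_threshold(y_true, y_pred, confidence)
```

Note that this only filters on the predicted confidence, so a confidently wrong prediction for a non-visible keypoint would still enter the average.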
I think I understand your points now, except that I don't understand why making essentially random predictions for non-visible keypoints will impact the Euclidean error. If the keypoint is not visible, then there is no true location for the predicted location, of any confidence level, to be compared to, right? Or will it be compared to a random location? |
I've verified that training works for non-visible points, thanks for making the change! I disabled graph prediction and observed wacky euclidean errors as you said. Something that confused me at first, so maybe helps others to understand, is that the data generator converts NaN (non-visible) coordinates to -9999, presumably because this is so far away from the image that the target confidence map will be all 0. I will try to update the euclidean error metric to exclude non-visible points. It seems like this could be done using the ground truth visibility rather than implementing a confidence threshold? |
Yes. This was done for two reasons 1) imgaug doesn't support NaN values for keypoints (or didn't at the time, but perhaps this has been updated), and 2) a large negative value was used so that even after augmentation the keypoint would be outside the coordinate range of the image in order to produce an all zero confidence map. Now that I look at the code perhaps this should be done within the
Yeah, that's a good point, you're right. I was wrong. Not sure what I was thinking before. You can use the ground truth data to select which keypoints are included in the euclidean error calculations. |
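To make the two ideas above concrete, here is a rough sketch of replacing NaN annotations with a large negative placeholder and then using the ground truth, rather than a confidence threshold, to decide which keypoints enter the error. The value -9999 comes from the thread; the function names and list layout are hypothetical, and this is not the generator's actual code.

```python
import math

SENTINEL = -9999.0  # placeholder value mentioned in the thread


def nan_to_sentinel(keypoints, sentinel=SENTINEL):
    """Replace NaN keypoints with a large negative placeholder."""
    return [
        (sentinel, sentinel) if math.isnan(x) or math.isnan(y) else (x, y)
        for x, y in keypoints
    ]


def visible_mask(keypoints, sentinel=SENTINEL):
    """True for keypoints whose ground truth is not the placeholder."""
    return [not (x == sentinel and y == sentinel) for x, y in keypoints]


def mean_error_visible(y_true, y_pred, sentinel=SENTINEL):
    """Mean Euclidean error over keypoints that are visible in y_true."""
    mask = visible_mask(y_true, sentinel)
    errors = [math.dist(t, p) for t, p, m in zip(y_true, y_pred, mask) if m]
    return sum(errors) / len(errors) if errors else math.nan


annotations = [(10.0, 10.0), (math.nan, math.nan), (30.0, 30.0)]
y_true = nan_to_sentinel(annotations)
y_pred = [(10.0, 13.0), (57.0, 3.0), (34.0, 33.0)]
error = mean_error_visible(y_true, y_pred)
```

The exact equality check against the placeholder assumes no augmentation has moved it, which matches the validation case discussed in the thread.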
So I was going to implement this, but then realized it could be a bit tricky. There are three places changes could be implemented: (1) the Logger callback's on_epoch_end; (2) base_model.evaluate; (3) the utils.keypoints.keypoint_errors function. (3) makes the most sense to me, but I wonder what you think? Also, I noticed that there is a confidence mask option built into the logger already, so this is a possible but not ideal solution, I think. In any case, the problem is that we need to identify which points are not visible for each frame, but NaN values have been converted to large negative numbers. Naively, I'd say we could just identify any keypoints which are negative and mask with those. However, is it possible that an augmentation pipeline could make some true annotation points take on negative values? If so, either we could hack it using some 'large' negative threshold, or a more principled approach would be to modify base_model.evaluate to grab keypoints without the NaN conversion? |
Right, I agree. The confidence mask option was added as a quick solution, because it was just a couple of lines of code. Another consideration is that, when evaluating the model, you want to ensure it predicts low confidence scores for points that aren't visible, but perhaps that type of evaluation should be separate from the Euclidean error score.
Augmentation isn't applied during validation, so the non-visible keypoints in the validation set will always be set to a static value and should be easy to detect. https://github.com/jgraving/DeepPoseKit/blob/master/deepposekit/io/BaseGenerator.py#L107 Should be just a matter of adapting the code from the confidence mask https://github.com/jgraving/DeepPoseKit/blob/master/deepposekit/callbacks.py#L138 However, I suppose if users wanted an additional validation augmenter feature down the line it would be better for the code to be more general. Not sure how valuable a feature like that is though, as it would apply non-deterministic transformations during validation. |
Ok, I just wasn’t sure, because if we are modifying the utils.keypoints.keypoint_errors function, I worried it might be used in other places in the code.
So it sounds like the thing to do is to modify that function so it identifies non-visible points by their large negative value and sets all errors for those points to NaN. Then the logger can be changed to mask out NaN errors, and the NaN indexes can also be used to report confidence scores separately for visible and non-visible points.
(Sent by email in reply to Jake Graving's comment of Wed, Apr 15, 2020, 10:58 AM.)
|
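The plan in the comment above could be sketched as follows: compute per-keypoint errors, mark non-visible keypoints as NaN, then have the reporting step mask NaNs and split confidence by visibility. This is only an illustration under assumptions; `keypoint_errors_with_nan` and `summarize` are hypothetical names and do not reproduce the signatures of `utils.keypoints.keypoint_errors` or the Logger callback.

```python
import math

SENTINEL = -9999.0  # placeholder the generator uses for non-visible keypoints


def keypoint_errors_with_nan(y_true, y_pred, sentinel=SENTINEL):
    """Per-keypoint Euclidean error, NaN where the ground truth is the placeholder."""
    return [
        math.nan if t[0] == sentinel and t[1] == sentinel else math.dist(t, p)
        for t, p in zip(y_true, y_pred)
    ]


def _mean(values):
    """Mean of a list, NaN if the list is empty."""
    return sum(values) / len(values) if values else math.nan


def summarize(errors, confidence):
    """Mask NaN errors and report confidence separately by visibility."""
    visible = [not math.isnan(e) for e in errors]
    return {
        "euclidean": _mean([e for e, v in zip(errors, visible) if v]),
        "confidence_visible": _mean(
            [c for c, v in zip(confidence, visible) if v]
        ),
        "confidence_not_visible": _mean(
            [c for c, v in zip(confidence, visible) if not v]
        ),
    }


y_true = [(0.0, 0.0), (SENTINEL, SENTINEL), (6.0, 8.0)]
y_pred = [(3.0, 4.0), (50.0, 50.0), (6.0, 8.0)]
confidence = [0.75, 0.125, 0.25]

errors = keypoint_errors_with_nan(y_true, y_pred)
summary = summarize(errors, confidence)
```

Keeping the NaN marking inside the error function means any other caller sees NaN for non-visible keypoints and has to handle it explicitly, which is the concern raised at the start of the comment.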
Just made a pull request for this issue, plus a few minor things that came up in implementing it |
@jgraving Hi, is it possible to annotate multiple object instances with keypoints in the annotator? I have a dataset in which each image contains more than one object instance but the annotator just annotates the keypoints of one object. Is this feature supported? |
So when annotating, sometimes keypoints are either not visible due to the animal's posture, or in some cases literally missing (locusts can easily lose a limb).
It would be good to have a clear way of flagging these in the training data. For now, I'm leaving them at the starting co-ordinate (which looks to be (1,1) or thereabouts), but I don't know how this affects the training process.
Would it be possible to have a way of actually setting them to NaN or something like that? Or does leaving them at the default point have the same effect during training?