Warning from clu.metrics when running evaluation #4
Hi @YingtianDt. Could you kindly share your code for how you performed the inference/predictions and visualized the results? I was able to train the model, but I cannot figure out how to use the output from the resulting directory (which has the checkpoints and plugins subfolders from training). How do I use the trained model to test it on new video data and visualize the predictions?
Hi @nilushacj. I am also working on this, so I don't have complete code for it yet. But I suggest checking out lib/trainer/evaluate.py:300, where the function "evaluate" uses a writer to first record the evaluation metrics (write_scalars) and then the predictions (write_images). It seems to generate a file for TensorBoard, but I didn't use it.
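For reference, a minimal sketch of that writer pattern (not the repo's exact code; the logdir path, step value, and dummy arrays here are placeholder assumptions). CLU's metric_writers serializes scalars and image grids into a TensorBoard event file under the given log directory:

```python
import numpy as np
from clu import metric_writers

logdir = "/tmp/eval_logs"  # hypothetical path; use your checkpoint/log directory
writer = metric_writers.create_default_writer(logdir)

step = 100
writer.write_scalars(step, {"eval_ari": 0.88, "eval_ari_nobg": 0.94})
# write_images expects a mapping from tag to an array of shape [N, H, W, C].
writer.write_images(step, {"segmentations": np.zeros((5, 64, 320, 3), np.uint8)})
writer.flush()
```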
Hi @YingtianDt. Would you happen to know specifically which file it is? When I open TensorBoard with the output checkpoint folder as the log directory, I am able to visualize the results of the training there. However, I am looking for the actual matrices/tensors corresponding to the predictions, so that I can save each output video (segmentations, flow and bounding boxes) and visualize it. As you correctly pointed out, I have been looking at the "evaluate" function (in both the trainer and the evaluator) given in the code. I am not quite sure how I can use the output from "writer.write_images", and the CLU documentation for this is also a bit unclear to me.

I printed the keys and value shapes of the dictionary produced by the "jax.tree_map" call that is used as input to "writer.write_images" (screenshot attached below). If you look at the keys "video", "segmentations", "flow" and "boxes" in the attached image, the shape is (5, 64, 320, 3) for all four of them. I believe the 1st element (5) is the number of frames, the 2nd element (64) is the batch size, and the 4th element (3) is the number of channels. But what is the 3rd element (320)?
Hi again @YingtianDt. For the second part of my message earlier, I will go ahead and answer it myself :D. I believe this shape corresponds to an image grid that is built from the video arrays (by the function "video_to_image_grid" in "utils.py").
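A minimal sketch of what such a grid conversion could do, assuming it simply tiles the T frames of each video side by side along the width axis; that assumption reproduces shapes like (N, 64, 320, 3) from (N, 5, 64, 64, 3), since 5 * 64 = 320. The repo's actual "video_to_image_grid" may arrange things differently:

```python
import numpy as np

def video_to_image_grid_sketch(video: np.ndarray) -> np.ndarray:
    """Converts (N, T, H, W, C) videos into (N, H, T*W, C) image strips."""
    n, t, h, w, c = video.shape
    # Move the time axis next to the width axis, then merge them so the
    # frames of each video sit side by side in one image.
    return video.transpose(0, 2, 1, 3, 4).reshape(n, h, t * w, c)

frames = np.random.rand(8, 5, 64, 64, 3)  # hypothetical batch of 8 videos
grid = video_to_image_grid_sketch(frames)
print(grid.shape)  # (8, 64, 320, 3)
```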
Hi @YingtianDt, thanks a lot for bringing this warning to our attention! You can ignore this warning for the sake of model training, but it highlights that the displayed "loss" metric at evaluation time can be slightly off.

Let me try to explain what's going on: to make sure we can evaluate on all validation/test set examples irrespective of our batch size (while keeping the batch size constant for all individual forward passes through the model), we append "dummy" examples at the end of the validation/test set, which are labeled using a batch mask. As the warning says, this mask is ignored for the "loss" output, because the loss has already been reduced to a scalar, so the dummy examples are not excluded from it.

Fixing this requires a slight refactoring of our loss functions. Since this doesn't have any effect on model training, and since we do not report the evaluation "loss", we will leave this as-is for now.
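A minimal sketch of the mechanism described above (not the repo's loss code; the loss values and mask are made-up examples). CLU's Average metric can exclude padded dummy examples via a mask, but only while the metric value still has a per-example batch dimension; once the loss is reduced to shape (), the mask no longer matches and CLU ignores it, which is exactly the shape mismatch reported in the warning:

```python
import jax.numpy as jnp
from clu import metrics

# Metric class that averages the "loss" entry of the model output,
# honoring an optional per-example mask.
LossAverage = metrics.Average.from_output("loss")

per_example_loss = jnp.array([0.5, 0.7, 0.2, 0.9])
mask = jnp.array([1.0, 1.0, 1.0, 0.0])  # last example is a padded dummy

# Mask applies: only the three real examples enter the average.
masked = LossAverage.from_model_output(loss=per_example_loss, mask=mask)
print(masked.compute())  # (0.5 + 0.7 + 0.2) / 3

# Mask ignored: the loss was already reduced to a scalar of shape (), so CLU
# logs "Ignoring mask for model output 'loss' because of shape mismatch"
# and the padded dummy example leaks into the average.
unmasked = LossAverage.from_model_output(loss=per_example_loss.mean(), mask=mask)
print(unmasked.compute())
```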
@tkipf Thank you very much for your reply! I will definitely watch out for the loss metric. Please feel free to close the issue if there are no more questions from other people.

@nilushacj Hi, I think for the first part of your question, the outputs from "writer.write_images" are written into the TensorBoard event file in the log directory, so that is the file to look at.
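A hedged sketch of pulling the logged images back out of that event file, assuming the writer produced standard TensorBoard image summaries (the logdir path and tag names below are assumptions based on the keys mentioned earlier in this thread):

```python
from tensorboard.backend.event_processing import event_accumulator

logdir = "/path/to/checkpoint_dir"  # hypothetical: wherever the events.* files live
acc = event_accumulator.EventAccumulator(
    logdir, size_guidance={event_accumulator.IMAGES: 0})  # 0 = keep all images
acc.Reload()

for tag in acc.Tags()["images"]:  # e.g. "video", "segmentations", "flow", "boxes"
    for event in acc.Images(tag):
        # event.encoded_image_string holds raw PNG bytes for this step.
        fname = f"{tag.replace('/', '_')}_{event.step}.png"
        with open(fname, "wb") as f:
            f.write(event.encoded_image_string)
# Note: summaries written with the TF2 tensor-based API may land under
# acc.Tensors(tag) instead of acc.Images(tag).
```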
Hi, thank you very much for your wonderful work!
I just tried to run it on the MOVi-A dataset. At evaluation time, clu.metrics logs the following (I adjusted the batch size to 8):
metrics.py:232] Ignoring mask for model output 'loss' because of shape mismatch: output.shape=() vs. mask.shape=(8,)
I read through the code and think this is fine, but I am writing to make sure.
Also, may I ask whether my evaluation metrics on MOVi-A look fine? -- eval_ari=0.8829; eval_ari_nobg=0.9373; eval_loss=28674.48.
Thank you in advance.