
Clarification on clip_length Parameter for Reproducing EK100 Action Recognition Results #15

Open
Geneam opened this issue Dec 28, 2024 · 4 comments

Comments

Geneam commented Dec 28, 2024

Dear Author,

I would like to ask: when reproducing the action recognition results on EK100, should the parameter clip_length be set to 16? I used the evaluation command from model.md as follows:

mkdir $EXP_PATH
PYTHONPATH=.:third_party/decord/python/ torchrun \
    --nproc_per_node=8 scripts/main_lavila_finetune_cls.py \
    --root datasets/EK100/EK100_320p_15sec_30fps_libx264/ \
    --video-chunk-length 15 --use-flash-attn \
    --grad-checkpointing \
    --use-fast-conv1 \
    --batch-size 64 \
    --fused-decode-crop \
    --use-multi-epochs-loader \
    --pretrain-model experiments/pretrain_lavila_vitb/checkpoint_best.pt \
    --resume ${PATH_TO}/avion_finetune_cls_lavila_vitb_best.pt \
    --evaluate
# --resume and --evaluate are additional to the training script

The results I obtained were about 7% lower than reported. In the command above, the clip_length parameter appears to default to 4.

Could you please clarify the value of clip_length used in the results you reported? Additionally, does this parameter significantly impact the model’s performance?

Looking forward to your response.

@dhimitriosduka1

Dear @Geneam,

I know this is a bit of an unrelated question, but I was curious whether, during installation of the environment, you ran into a problem building the decord module. In my case, I got an error related to AVBSFContext at ffmpeg_common.h:187:5.

If you encountered such a problem, how did you manage to solve it?

Looking forward to your response.


Geneam commented Jan 14, 2025

> Dear @Geneam,
>
> I know this is a bit of an unrelated question, but I was curious whether, during installation of the environment, you ran into a problem building the decord module. In my case, I got an error related to AVBSFContext at ffmpeg_common.h:187:5.
>
> If you encountered such a problem, how did you manage to solve it?
>
> Looking forward to your response.

Sorry for the late reply. I installed the decord module following INSTALL.md and didn't encounter the issue you mentioned. Have you managed to resolve it? If not, could you share more details: which installation command were you running when the error occurred, and what was the exact error message?

@zhaoyue-zephyrus (Owner)

Hi @Geneam, sorry for the late reply.

> Could you please clarify the value of clip_length used in the results you reported?

The clip length is always 16 when we measure the fine-tuned classification accuracy. A shorter clip length (e.g., 4) will likely perform worse, but a 7% drop sounds too large. Could you try 16 instead?
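For reference, here is a sketch of the evaluation command with the clip length set explicitly to 16. The `--clip-length` flag name is an assumption, not confirmed in this thread; check `scripts/main_lavila_finetune_cls.py --help` (or the argparse definitions in that script) for the exact spelling before running:

```
PYTHONPATH=.:third_party/decord/python/ torchrun \
    --nproc_per_node=8 scripts/main_lavila_finetune_cls.py \
    --root datasets/EK100/EK100_320p_15sec_30fps_libx264/ \
    --video-chunk-length 15 --use-flash-attn \
    --clip-length 16 \
    --batch-size 64 \
    --fused-decode-crop \
    --pretrain-model experiments/pretrain_lavila_vitb/checkpoint_best.pt \
    --resume ${PATH_TO}/avion_finetune_cls_lavila_vitb_best.pt \
    --evaluate
```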

@zhaoyue-zephyrus (Owner)

Hi @dhimitriosduka1,

> Dear @Geneam,
>
> I know this is a bit of an unrelated question, but I was curious whether, during installation of the environment, you ran into a problem building the decord module. In my case, I got an error related to AVBSFContext at ffmpeg_common.h:187:5.
>
> If you encountered such a problem, how did you manage to solve it?
>
> Looking forward to your response.

This might be relevant to #15 (comment). Can you try ffmpeg 4? Alternatively, there is a new PR adding support for ffmpeg >= 5.0; I haven't had time to test it yet, but it might be worth trying.
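As a sketch of the workaround suggested above: the AVBSFContext error typically appears when building decord against ffmpeg 5+ headers, so first check which ffmpeg version the build will pick up, then (if you use conda) pin it to the 4.x series. The conda-forge channel and package name are standard; exact 4.x version availability on your platform is an assumption:

```
# Check which ffmpeg the decord build will find
ffmpeg -version | head -n 1

# If using conda, pin ffmpeg to the 4.x series before rebuilding decord
conda install -c conda-forge "ffmpeg=4"
```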
