
mae_visualize models vs mae_pretrain_full models #12

Closed
amirhfarzaneh opened this issue Jan 13, 2022 · 2 comments

Comments


amirhfarzaneh commented Jan 13, 2022

Hello,

Thank you for the great work and the great repo. I have been experimenting with different pre-trained models for visualization. When I use the mae_visualize_vit_base.pth checkpoint, I get reconstruction results like those in the demo and the paper, as below:

[Screenshot: reconstructions matching the demo and the paper]

However, when I use the mae_pretrain_vit_base_full.pth checkpoint, the results look like this:

[Screenshot: reconstructions that look much worse]

mask_ratio=0.75 for both results.
So here are my questions:

  1. Can you please clarify the difference between the visualize and full checkpoints, and why the results look worse with the full checkpoint?
  2. If I want to finetune an MAE model (both encoder and decoder) for reconstruction on a custom dataset, which checkpoint is recommended?

I would appreciate it if you could help me with these questions.

@KaimingHe
Contributor

As noted in the issue where you found this checkpoint (#8), mae_pretrain_vit_base_full.pth is trained with normalized pixels (see Table 1d in the paper), so its reconstructions are normalized per patch. What you see is the correct reconstruction. If you apply the same normalization to the ground-truth image (https://github.com/facebookresearch/mae/blob/main/models_mae.py#L205), you can see what the model is expected to reconstruct.

mae_visualize_vit_base.pth is trained with unnormalized pixels. That is the default setting for all results in Table 1 (except 1d). It is slightly worse in terms of representation quality (e.g., classification results).

If your goal is to reconstruct a good-looking image, use unnormalized pixels. If your goal is to finetune for a downstream recognition task, use normalized pixels.
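To make the distinction concrete, here is a minimal numpy sketch of the per-patch target normalization that norm_pix_loss enables (the function name, array shapes, and epsilon value are illustrative assumptions, not the repo's actual code; the real implementation operates on PyTorch tensors in models_mae.py):

```python
import numpy as np

def normalize_patch_targets(patches, eps=1e-6):
    """Normalize each flattened pixel patch to zero mean and unit variance.

    patches: (num_patches, patch_dim) array of flattened pixel patches.
    This mirrors the idea behind MAE's norm_pix_loss target: the model
    regresses these normalized values, not raw pixel intensities.
    """
    mean = patches.mean(axis=-1, keepdims=True)
    var = patches.var(axis=-1, keepdims=True)
    return (patches - mean) / np.sqrt(var + eps)

# A model trained this way reconstructs zero-mean, unit-variance patches,
# so visualizing its output directly looks "wrong" unless the ground truth
# is normalized the same way before comparison.
rng = np.random.default_rng(0)
patches = rng.uniform(0.0, 1.0, size=(4, 16 * 16 * 3))  # 4 patches of 16x16x3
targets = normalize_patch_targets(patches)
print(targets.mean(axis=-1))  # each entry close to 0
print(targets.var(axis=-1))   # each entry close to 1
```

This is why the "full" checkpoint's raw outputs look worse when rendered as images: they live in a per-patch normalized space rather than in pixel space.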

@amirhfarzaneh
Author

That makes total sense. I had missed the norm_pix_loss option. Thanks for the clarification @KaimingHe
