# Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation
This repository contains the official PyTorch implementation of the ECCV 2022 paper "Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation" by Sung-Hoon Yoon and Hyeokjun Kweon.
We propose (1) the Gated Pyramid Pooling (GPP) layer, which resolves the architectural limitation of the classifier (or GAP), and (2) the Adversarial Erasing Framework via Triplet (AEFT), which effectively prevents over-expansion through a triplet formulation while preserving the benefits of adversarial erasing (AE). With image-level supervision only, we achieve new state-of-the-art results on both PASCAL VOC 2012 and MS-COCO.
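For intuition only, below is a minimal PyTorch sketch of one way a gated pyramid pooling classification head could combine multi-scale pooled predictions. The bin sizes, the scalar sigmoid gates, and all names (e.g. `GatedPyramidPooling`) are our own assumptions for illustration; please refer to the paper and the code in this repository for the exact GPP layer.

```python
# Illustrative sketch of a gated pyramid pooling head (NOT the exact layer used
# in this repo): multi-scale average pooling + shared 1x1 classifier + learned
# sigmoid gates that weight each scale's image-level logits.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedPyramidPooling(nn.Module):
    def __init__(self, in_channels, num_classes, bins=(1, 2, 3, 6)):
        super().__init__()
        self.bins = bins
        # Shared 1x1 classifier applied to every pooled scale.
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1, bias=False)
        # One learnable scalar gate per pyramid scale (sigmoid-activated).
        self.gates = nn.Parameter(torch.zeros(len(bins)))

    def forward(self, x):
        # x: (B, C, H, W) feature map from the backbone.
        scale_logits = []
        for bin_size in self.bins:
            pooled = F.adaptive_avg_pool2d(x, bin_size)            # (B, C, b, b)
            logits = self.classifier(pooled).mean(dim=(2, 3))      # (B, num_classes)
            scale_logits.append(logits)
        gates = torch.sigmoid(self.gates)                          # (S,)
        stacked = torch.stack(scale_logits, dim=0)                 # (S, B, num_classes)
        # Gated aggregation of the per-scale image-level logits.
        return (gates[:, None, None] * stacked).sum(dim=0) / gates.sum()


if __name__ == "__main__":
    head = GatedPyramidPooling(in_channels=4096, num_classes=20)
    feats = torch.randn(2, 4096, 32, 32)
    print(head(feats).shape)  # torch.Size([2, 20])
```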
## Prerequisite
- Tested on Ubuntu 18.04 with Python 3.8, PyTorch 1.8.2, and CUDA 11.4, on both single- and multi-GPU setups.
- You can create the conda environment with the provided yaml file:
conda env create -f wsss_new.yaml
- The PASCAL VOC 2012 development kit: you need to place VOC2012 under the ./data folder.
- ImageNet-pretrained weights for resnet38d are from [resnet_38d.params]. You need to place the weights at ./pretrained/resnet_38d.params. Note that we have re-uploaded this file, as the previous upload had been corrupted.
- A pretrained weight for PASCAL VOC (seed: 56.2% mIoU) can be downloaded here
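Based on the paths mentioned above, the working directory is expected to look roughly as follows (a sketch, not an exhaustive listing; the contents of VOC2012 follow the standard devkit layout, and ./experiment is created by the training script):

```
./data/VOC2012/                 # PASCAL VOC 2012 development kit
./pretrained/resnet_38d.params  # ImageNet-pretrained ResNet-38d weights
./experiment/[exp_name]/        # training outputs (created automatically)
```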
## Usage
With the following commands, you can generate pseudo labels to train the segmentation network. This pipeline includes AffinityNet [1].
- Please specify the name of your experiment.
- Training results are saved at ./experiment/[exp_name]
python train.py --name [exp_name] --model aeft_gpp
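For example, with a hypothetical experiment name such as aeft_voc (used throughout the examples below):
python train.py --name aeft_voc --model aeft_gpp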
To train AffinityNet, you must first extract the following: (1) seeds, (2) CRF(low): int, and (3) CRF(high): int.
Option 1: if you need to train AffinityNet. For faster convergence, use res38_cls.pth. The best setting for AEFT is (low: 2, high: 21, dup: 19).
python infer.py --name [exp_name] --model aeft_gpp --load_epo [epoch_to_load] --dict --crf --alphas [crf value to extract, e.g. CRF(low), CRF(high)] --infer_list voc12/train_aug.txt
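As a concrete instance of Option 1 with the hypothetical experiment name from above (assuming the comma-separated --alphas format shown in Option 2, and assuming the --dup alpha also needs to be extracted at this step):
python infer.py --name aeft_voc --model aeft_gpp --load_epo [epoch_to_load] --dict --crf --alphas 2,21,19 --infer_list voc12/train_aug.txt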
Option 2: if you only need to extract the CRF results.
python infer.py --name [exp_name] --model aeft_gpp --load_epo [epoch_to_load] --dict --crf --alphas [crf value to extract, e.g. 6,7,8] --infer_list voc12/train.txt
Then train AffinityNet and run inference with it. The best setting for AEFT is (low: 2, high: 21, dup: 19); a concrete example follows the two commands below.
python train_aff.py --name [exp_name: use the same name as above] --low [CRF(low):int] --high [CRF(high):int] --dup [CRF value for affinity:int]
python infer_aff.py --name [exp_name: use the same name as above] --low [CRF(low):int] --high [CRF(high):int] --dup [CRF value for affinity:int] --train_list voc12/train_aug.txt
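For example, with the best AEFT setting and the hypothetical experiment name used above:
python train_aff.py --name aeft_voc --low 2 --high 21 --dup 19
python infer_aff.py --name aeft_voc --low 2 --high 21 --dup 19 --train_list voc12/train_aug.txt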
## Evaluation
To evaluate the CAM seeds:
python evaluation.py --name [exp_name] --task cam --dict_dir dict
To evaluate the CRF-refined results:
python evaluation.py --name [exp_name] --task crf --dict_dir crf/[xx]
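For instance, if CRF maps were extracted with alpha 21 as in the example above, and assuming the outputs are stored in per-alpha subfolders, the second command would look like:
python evaluation.py --name aeft_voc --task crf --dict_dir crf/21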
## Citation
If our code is useful to you, please consider citing our ECCV 2022 paper using the following BibTeX entry.
@inproceedings{yoon2022adversarial,
title={Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation},
author={Yoon, Sung-Hoon and Kweon, Hyeokjun and Cho, Jegyeong and Kim, Shinjeong and Yoon, Kuk-Jin},
booktitle={European Conference on Computer Vision},
pages={326--344},
year={2022},
organization={Springer}
}
## Acknowledgement
We heavily borrow code from the AffinityNet repository [1]. Thanks for the excellent code!
## Reference
[1] J. Ahn and S. Kwak. Learning pixel-level semantic affinity with image-level supervision for weakly supervised semantic segmentation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.