Xiang Chen, Hao Li, Mingqiang Li, and Jinshan Pan
Abstract: Transformer-based methods have achieved significant performance in image deraining as they can model the non-local information which is vital for high-quality image reconstruction. In this paper, we find that most existing Transformers usually use all similarities of the tokens from the query-key pairs for the feature aggregation. However, if the tokens from the query are different from those of the key, the self-attention values estimated from these tokens are still involved in feature aggregation, which accordingly interferes with clear image restoration. To overcome this problem, we propose an effective DeRaining network, Sparse Transformer (DRSformer), that can adaptively keep the most useful self-attention values for feature aggregation so that the aggregated features better facilitate high-quality image reconstruction. Specifically, we develop a learnable top-k selection operator to adaptively retain the most crucial attention scores from the keys for each query for better feature aggregation. Simultaneously, as the naive feed-forward network in Transformers does not model the multi-scale information that is important for latent clear image restoration, we develop an effective mixed-scale feed-forward network to generate better features for image deraining. To learn an enriched set of hybrid features, which combines local context from CNN operators, we equip our model with a mixture of experts feature compensator to present a cooperation refinement deraining scheme. Extensive experimental results on the commonly used benchmarks demonstrate that the proposed method achieves favorable performance against state-of-the-art approaches. The source codes are available at https://github.com/cschenxiang/DRSformer.
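To make the top-k idea in the abstract concrete, here is a minimal pure-Python sketch of sparse attention for a single head. It is an illustration only, not the code in this repository: the function name `topk_sparse_attention` is hypothetical, `k` is a fixed integer here (the paper describes a learnable selection), and score scaling, multiple heads, and tensor batching are all omitted.

```python
import math

def topk_sparse_attention(scores, values, k):
    """Aggregate `values` using only the k largest scores of each query row.

    scores: list of rows, scores[i][j] = similarity of query i and key j
    values: list of value vectors, one per key
    k:      number of keys kept per query (fixed here as a simplification)
    """
    outputs = []
    dim = len(values[0])
    for row in scores:
        # Indices of the k highest-scoring keys for this query.
        order = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(order[:k])
        # Softmax over the kept scores only; dropped keys get weight 0,
        # which matches masking them with -inf before the softmax.
        peak = max(row[j] for j in keep)
        exps = [math.exp(row[j] - peak) if j in keep else 0.0
                for j in range(len(row))]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[d] for w, v in zip(weights, values))
                        for d in range(dim)])
    return outputs
```

With scores `[[5.0, 1.0, 5.0]]` and `k=2`, the low-scoring middle key is dropped entirely, so its value vector contributes nothing to the output no matter how large it is.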
Dataset | Rain200L | Rain200H | DID-Data | DDN-Data | SPA-Data |
---|---|---|---|---|---|
Baidu Cloud | Download (s2yx) | Download (z9br) | Download (5luo) | Download (ldzo) | Download (yjow) |
- Please download the corresponding training datasets and put them in the folder `Datasets/train`. Download the testing datasets and put them in the folder `Datasets/test`.
- Note that we do not use MEFC for training Rain200L and SPA-Data, because their rain streaks are less complex and easier to learn. Please modify the file `DRSformer_arch.py`.
- Follow the instructions below to begin training our model.
cd DRSformer
bash train.sh
Run the script; the generated experimental logs are written to the folder `experiments`.
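Before launching training, the folder layout described in the training notes can be checked with a short script. This is a hedged sketch: the helper `missing_folders` is hypothetical and not part of this repository, and it assumes `Datasets/train` and `Datasets/test` are resolved relative to the directory you pass in.

```python
from pathlib import Path

def missing_folders(root, expected=("Datasets/train", "Datasets/test")):
    """Return the expected dataset folders that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]

if __name__ == "__main__":
    # Assumption: the script is run from the directory that holds `Datasets`.
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset folders found.")
```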
- Please download the corresponding testing datasets and put them in the folder `test/input`. Download the corresponding pre-trained models and put them in the folder `pretrained_models`.
- Note that we do not use MEFC for training Rain200L and SPA-Data, because their rain streaks are less complex and easier to learn. Please modify the file `DRSformer_arch.py`. See the file `DRSformer_arch_200L+SPA.py`.
- Follow the instructions below to begin testing our model.
python test.py --task Deraining --input_dir './test/input/' --result_dir './test/output/'
Run the script; the output visual results are saved in the folder `test/output/Deraining`.
Dataset | Rain200L | Rain200H | DID-Data | DDN-Data | SPA-Data |
---|---|---|---|---|---|
Baidu Cloud | Download (kzj5) | Download (j10m) | Download (nact) | Download (hj6r) | Download (vfvt) |
Google Drive | Download | Download | Download | Download | Download |
See the folder "evaluations".
- For Rain200L/H and SPA-Data datasets: PSNR and SSIM results are computed using this Matlab Code.
- For DID-Data and DDN-Data datasets: PSNR and SSIM results are computed using this Matlab Code.
Please note that Table 1 above is our final camera-ready version. There is a slight gap between the final version and the arXiv version due to errors caused by different testing devices and environments. We recommend downloading the visual deraining results and retesting the quantitative results on your own device and environment.
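For a quick sanity check when retesting, PSNR follows the standard definition `10 * log10(MAX^2 / MSE)`. The sketch below is a generic pure-Python version of that formula, not a port of the Matlab code in "evaluations"; the function name `psnr` and the flat-list image representation are assumptions, and the Matlab scripts may preprocess images differently (for example in the color space they evaluate), so numbers from this sketch are not guaranteed to match the reported tables.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images,
    each given as a flat sequence of pixel intensities."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

For official comparisons, use the Matlab code referenced above so that results stay consistent with the same PSNR/SSIM codes.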
Dataset | Rain200L | Rain200H | DID-Data | DDN-Data | SPA-Data |
---|---|---|---|---|---|
DualGCN | DWL (v8qy) | DWL (jnc9) | DWL (3gdx) | DWL (1mdx) | DWL (lkeb) |
SPDNet | DWL (y39h) | DWL (mry2) | DWL (klci) | DWL (19bm) | DWL (dd98) |
Uformer | - | - | DWL (4uur) | DWL (39bj) | - |
Restormer | DWL (6a2z) | DWL (9m1r) | DWL (1hql) | DWL (crj4) | DWL (b40z) |
IDT | DWL (v4yd) | DWL (77i4) | DWL (8uxx) | DWL (0ey6) | DWL (b862) |
Ours | DWL (hyuv) | DWL (px2j) | DWL (t879) | DWL (9vtz) | DWL (bl4n) |
For DualGCN, SPDNet, Restormer and IDT, we retrain the models released by the authors if no pretrained models are provided; otherwise we evaluate them with their online code. For Uformer, we refer to some reported results in IDT. Note that since the PSNR/SSIM codes used to test DID-Data and DDN-Data in their paper are different from ours, we retrain Uformer on DID-Data and DDN-Data. For other previous methods, we refer to the results reported here with the same PSNR/SSIM codes.
If you are interested in this work, please consider citing:
@InProceedings{Chen_2023_CVPR,
author={Chen, Xiang and Li, Hao and Li, Mingqiang and Pan, Jinshan},
title={Learning a Sparse Transformer Network for Effective Image Deraining},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month={June},
year={2023},
pages={5896-5905}
}
This code is based on Restormer. Thanks for their awesome work.
Should you have any questions or suggestions, please contact [email protected].