PyTorch implementation of the paper "DMVC: Decomposed Motion Modeling for Learned Video Compression", T-CSVT 2022.
- Python==3.7
- PyTorch==1.11
This repository implements a learned video compression framework with decomposed motion modeling for the low-delay coding scenario. Past reconstructions from the decoded picture buffer (DPB) are first used to capture the intrinsic motion. The intrinsic motion is propagated along the temporal axis and drives the initial temporal transition. The compensatory motion is then learned without supervision, conditioned on the approximated feature, to refine the temporal transition. The core temporal transition takes place in the feature space. The left pixel-level residue between
This method focuses on P-frame compression. I frames are compressed with CompressAI. The test datasets include:
- HEVC common test sequences
- UVG dataset (1080p/8bit/YUV)
- MCL-JCV dataset (1080p/8bit/YUV)
The test sequences are first cropped so that both the width and the height are multiples of 64. They are then split into consecutive pictures with ffmpeg. Taking UVG as an example, the data processing steps are as follows.
- Crop videos from 1920x1080 to 1920x1024.
ffmpeg -pix_fmt yuv420p -s 1920x1080 -i ./videos/xxxx.yuv -vf crop=1920:1024:0:0 ./videos_crop/xxxx.yuv
- Convert YUV files to images.
ffmpeg -s 1920x1024 -pix_fmt yuv420p -i ./videos_crop/xxxx.yuv ./images_crop/xxxx/im%3d.png
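
For sequences at other resolutions, the crop size can be derived rather than typed by hand. The following is a minimal sketch, not part of this repository: the helper names (`floor_to_multiple`, `yuv420p_frame_bytes`, `build_crop_cmd`, `build_split_cmd`) are hypothetical, and it assumes 8-bit YUV 4:2:0 input and a crop anchored at the top-left corner, as in the UVG example above.

```python
def floor_to_multiple(value, base=64):
    """Largest multiple of `base` that does not exceed `value`."""
    return (value // base) * base


def yuv420p_frame_bytes(width, height):
    """Size of one 8-bit YUV 4:2:0 frame: one full-resolution luma plane
    plus two chroma planes at a quarter of the luma resolution each."""
    return width * height * 3 // 2


def build_crop_cmd(src, dst, width, height, base=64):
    """ffmpeg arguments that crop a raw YUV file to multiples of `base`."""
    crop_w = floor_to_multiple(width, base)
    crop_h = floor_to_multiple(height, base)
    return [
        "ffmpeg", "-pix_fmt", "yuv420p", "-s", f"{width}x{height}",
        "-i", src, "-vf", f"crop={crop_w}:{crop_h}:0:0", dst,
    ]


def build_split_cmd(src, out_pattern, width, height):
    """ffmpeg arguments that split a cropped raw YUV file into images."""
    return [
        "ffmpeg", "-s", f"{width}x{height}", "-pix_fmt", "yuv420p",
        "-i", src, out_pattern,
    ]


# Reproduce the two UVG commands from the steps above.
crop_cmd = build_crop_cmd("./videos/xxxx.yuv", "./videos_crop/xxxx.yuv", 1920, 1080)
split_cmd = build_split_cmd("./videos_crop/xxxx.yuv", "./images_crop/xxxx/im%3d.png",
                            floor_to_multiple(1920), floor_to_multiple(1080))
print(" ".join(crop_cmd))
print(" ".join(split_cmd))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` to execute it. Dividing the size of a cropped file by `yuv420p_frame_bytes(1920, 1024)` gives the number of frames to expect in the image folder.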
We train four different models for the PSNR metric, where
python eval.py --eval_lambda 256 --metric mse --intra_model cheng2020_anchor --test_class ClassD --gop_size 10 --pretrain ./checkpoints/dmvc_psnr_256.model
python eval.py --eval_lambda 8 --metric ms-ssim --intra_model cheng2020_anchor --test_class ClassD --gop_size 10 --pretrain ./checkpoints/dmvc_msssim_8.model
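
The two evaluation commands above differ only in the lambda value, the metric, and the checkpoint path. A minimal sketch of a wrapper that assembles them is shown below. The function `build_eval_cmd` and the `CONFIGS` list are hypothetical helpers rather than part of this repository; only the flags and values that appear in the commands above are used.

```python
def build_eval_cmd(eval_lambda, metric, checkpoint,
                   test_class="ClassD", gop_size=10,
                   intra_model="cheng2020_anchor"):
    """Assemble the eval.py argument list using the flags shown above."""
    return [
        "python", "eval.py",
        "--eval_lambda", str(eval_lambda),
        "--metric", metric,
        "--intra_model", intra_model,
        "--test_class", test_class,
        "--gop_size", str(gop_size),
        "--pretrain", checkpoint,
    ]


# Only the two configurations listed above; further lambda values and
# checkpoint names are not assumed here.
CONFIGS = [
    (256, "mse", "./checkpoints/dmvc_psnr_256.model"),
    (8, "ms-ssim", "./checkpoints/dmvc_msssim_8.model"),
]

commands = [build_eval_cmd(lam, metric, ckpt) for lam, metric, ckpt in CONFIGS]
for cmd in commands:
    print(" ".join(cmd))
```

To run the evaluations instead of printing them, pass each list to `subprocess.run(cmd, check=True)` from the repository root.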
If you find this paper useful, please cite:
@article{lin2022dmvc,
title={DMVC: Decomposed Motion Modeling for Learned Video Compression},
author={Lin, Kai and Jia, Chuanmin and Zhang, Xinfeng and Wang, Shanshe and Ma, Siwei and Gao, Wen},
journal={IEEE Transactions on Circuits and Systems for Video Technology},
year={2022},
publisher={IEEE}
}
- CompressAI: https://github.com/InterDigitalInc/CompressAI
- Benchmark: https://github.com/ZhihaoHu/PyTorchVideoCompression
- OpenDVC: https://github.com/RenYang-home/OpenDVC
- DCVC: https://github.com/DeepMC-DCVC/DCVC
- M-LVC: https://github.com/JianpingLin/M-LVC_CVPR2020
- RLVC: https://github.com/RenYang-home/RLVC