# C$^2$KD: Bridging the Modality Gap for Cross-Modal Knowledge Distillation

Code for the paper "C$^2$KD: Bridging the Modality Gap for Cross-Modal Knowledge Distillation".

## Usage

### Requirements

Install the dependencies listed in requirements.txt.
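
A minimal install sketch, assuming a working Python environment with pip:

```bash
# Install the project's dependencies
pip install -r requirements.txt
```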

### Data Preparation

Download the original datasets: CREMA-D, AVE, and VGGSound.

#### Pre-processing

For the AVE, CREMA-D, and VGGSound datasets, we provide code in the utils/data/ directory to pre-process videos into RGB frames and audio WAV files.
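
The scripts in utils/data/ define the actual options; as a rough illustration of what this step produces, frame and audio extraction with ffmpeg looks like the following (the paths, frame rate, and sample rate are placeholders, not the repository's defaults):

```bash
# Extract RGB frames at 1 fps (placeholder rate) from one video clip
ffmpeg -i video.mp4 -vf fps=1 frames/%05d.jpg

# Extract the audio track as a 16 kHz mono WAV file (placeholder settings)
ffmpeg -i video.mp4 -vn -ac 1 -ar 16000 audio.wav
```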

### Run commands

Detailed descriptions of the options can be found in main_overlap_tag.py; example commands are sketched after the list below.

1. Pre-train the single-modality model.
2. Conduct cross-modal knowledge distillation.
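
A minimal sketch of the two stages (the flags below are hypothetical placeholders; see main_overlap_tag.py for the actual option names and defaults):

```bash
# Stage 1: pre-train a single-modality model (hypothetical flags)
python main_overlap_tag.py --dataset CREMAD --modality audio --stage pretrain

# Stage 2: cross-modal knowledge distillation using the pre-trained model (hypothetical flags)
python main_overlap_tag.py --dataset CREMAD --stage distill --pretrained_ckpt path/to/checkpoint.pth
```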