# C$^2$KD: Bridging the Modality Gap for Cross-Modal Knowledge Distillation

Code for the paper "C$^2$KD: Bridging the Modality Gap for Cross-Modal Knowledge Distillation".

## Usage

### Requirements

Install the dependencies listed in requirements.txt.
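
A minimal install sketch, assuming a working Python environment with pip:

```bash
# Install the project's dependencies
pip install -r requirements.txt
```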

### Data Preparation

Download the original datasets: CREMA-D, AVE, and VGGSound.

#### Pre-processing

For the AVE, CREMA-D, and VGGSound datasets, we provide code in the utils/data/ directory to pre-process videos into RGB frames and audio WAV files.
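
The scripts in utils/data/ define the actual options; as a rough illustration of what this step produces, frame and audio extraction with ffmpeg looks like the following (the paths, frame rate, and sample rate are placeholders, not the repository's defaults):

```bash
# Extract RGB frames at 1 fps (placeholder rate) from one video clip
ffmpeg -i video.mp4 -vf fps=1 frames/%05d.jpg

# Extract the audio track as a 16 kHz mono WAV file (placeholder settings)
ffmpeg -i video.mp4 -vn -ac 1 -ar 16000 audio.wav
```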

### Run commands

Detailed descriptions of the options can be found in main_overlap_tag.py; example commands are sketched after the list below.

1. Pre-train the single-modality model.
2. Conduct cross-modal knowledge distillation.
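
A minimal sketch of the two stages (the flags below are hypothetical placeholders; see main_overlap_tag.py for the actual option names and defaults):

```bash
# Stage 1: pre-train a single-modality model (hypothetical flags)
python main_overlap_tag.py --dataset CREMAD --modality audio --stage pretrain

# Stage 2: cross-modal knowledge distillation using the pre-trained model (hypothetical flags)
python main_overlap_tag.py --dataset CREMAD --stage distill --pretrained_ckpt path/to/checkpoint.pth
```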