This repo contains scripts for downloading visual features, subtitles and annotations of all VALUE tasks.
Due to copyright issues, we cannot release the raw videos. However, we provide all the YouTube ids/TV episode versions along with their original timestamps to facilitate future end-to-end training on the VALUE benchmark.
- Visual features
  - ResNet
  - SlowFast
  - MIL-NCE-S3D
  - CLIP-ViT
- Subtitles
- Annotations
- Original video ids and timestamps for YouTube videos
Please see DATA.md.
We extract frame-level features at a fixed frame rate (1 feature every 1.5 seconds) and save them into one .npz file per video. To reproduce the feature extraction process, please follow the instructions and code released here.
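
As a rough illustration of working with the per-video files, the sketch below loads one .npz file with NumPy and maps feature indices to video time. The helper names (`load_video_features`, `feature_timestamps`, `time_to_feature_index`) are hypothetical and not part of this repo. The array names stored inside each .npz are not documented here, so the loader simply returns whatever arrays it finds. It also assumes that feature `i` starts at `i * 1.5` seconds, which should be checked against the extraction code.

```python
import numpy as np

# From the text above: one feature every 1.5 seconds.
FEATURE_INTERVAL_SEC = 1.5


def load_video_features(npz_path):
    """Return a dict mapping every array name stored in the .npz to its array.

    The key names are not assumed; inspect the returned dict to see
    what a given feature file actually contains.
    """
    with np.load(npz_path) as data:
        return {name: data[name] for name in data.files}


def feature_timestamps(num_features, interval=FEATURE_INTERVAL_SEC):
    """Start time in seconds of each feature.

    Assumption: feature i starts at i * interval.
    """
    return np.arange(num_features) * interval


def time_to_feature_index(t_sec, num_features, interval=FEATURE_INTERVAL_SEC):
    """Index of the feature covering time t_sec, clamped to the valid range."""
    idx = int(t_sec // interval)
    return min(max(idx, 0), num_features - 1)
```

For example, under these assumptions a video with 4 features has feature start times of 0.0, 1.5, 3.0 and 4.5 seconds, and an annotation at 3.2 seconds falls on feature index 2.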
Our features are released under the CC BY-NC-SA 4.0 license. For annotations, please see DATA.md.