awesome-video-editing

A list of papers on automatic video editing and related computer vision tasks.

The papers are grouped into categories, which unavoidably overlap and are somewhat imprecise. I use icons to mark several frequent application scenarios: 💬(talk/meeting), 💃(dance/performance), ⚽🏀🏌️🎾(sports), 🛒(ads/promotional videos), 🎬(movie), etc.

Note: This paper list does not include work on image/video manipulation (e.g. content editing, object removal, video stylization).

Text-based Video Editing

LLM-Powered Editing.

[IUI 2024] LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing [paper]

Using text as input to automatically create video sequences from a collection of videos or images.

[ICMR 2023] Shot Retrieval and Assembly with Text Script for Video Montage Generation [paper] [code]
[MM 2022] Transcript to Video: Efficient Clip Sequencing from Texts [paper] [project page]
[CHI 2020] Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness [paper]
[TOG 2019] Write-A-Video: Computational Video Montage from Themed Text [paper]
[TMM 2020] Story-driven Video Editing [paper]
[IMX 2020] Joint Attention for Automated Video Editing [paper] 💬
[UIST 2016] QuickCut: An Interactive Tool for Editing Narrated Video [paper]

Modifying the transcript of a speech to change the speech content or to remove filler words.

[TOG 2019] Text-based Editing of Talking-head Video [paper] 💬
[TOG 2012] Tools for Placing Cuts and Transitions in Interview Video [paper] 💬

Cutting and Sequencing Shots

To cut unedited videos into shots and/or to put them in an appropriate order.

[ICASSP 2024] Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos [paper] [project page]
[CVPR 2024] Towards Automated Movie Trailer Generation [paper] 🎬
[MM 2023] A Reinforcement Learning-Based Automatic Video Editing [paper] 🎬
[ICCV 2023 Workshop] Representation Learning of Next Shot Selection for Vlog Editing [paper]
[WACV 2023] Match Cutting: Finding Cuts with Smooth Visual Transitions [paper] [code] [project page] 🎬
[ACCV 2022] Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward [paper] [code] 🛒
[SAC 2022] Automated Video Editing Based on Learned Styles Using LSTM-GAN [paper] 💃
[ICCV 2021 Workshop] Learning Where To Cut From Edited Videos [paper]
[ICCV 2021] Learning to Cut by Watching Movies [paper] [code & dataset] [project page]
[WACV 2018] Learning Video-Story Composition via Recurrent Neural Network [paper]
[arXiv 2018] From Trailers to Storylines: An Efficient Way to Learn from Movies [paper] 🎬
[CVPR 2016] Video-Story Composition via Plot Analysis [paper]

Multi-Camera / Multi-Take Editing

To select video shots from multiple camera views or multiple takes of the same event.

[MIG 2023] Real-time Computational Cinematographic Editing for Broadcasting of Volumetric-captured events: an Application to Ultimate Fighting [paper] 🥊
[TOG 2022] PopStage: The Generation of Stage Cross-Editing Video Based on Spatio-Temporal Matching [paper] [project page] 💃
[ECCV 2022 Workshop] Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows [paper]
[ICME 2021] Reinforcement Learning Based Automatic Personal Mashup Generation [paper]
[TOMCCAP 2021] Smart Director: An Event-Driven Directing System for Live Broadcasting [paper] ⚽
[ICISP 2018] Automatic Camera Selection in the Context of Basketball Game [paper] 🏀
[TOG 2017] Computational Video Editing for Dialogue-Driven Scenes [paper] 💬
[ACE 2017] Automatic System for Editing Dance Videos Recorded Using Multiple Cameras [paper] 💃
[TOG 2014] Automatic Editing of Footage from Multiple Social Cameras [paper]
[CHI 2008] Improving Meeting Capture by Applying Television Production Principles with Audio and Motion Detection [paper] 💬
[ICME 2007] Automatic Multi-Modal Meeting Camera Selection for Video-Conferences and Meeting Browsers [paper] 💬

Video Summarization & Highlight Detection

[CVPR 2023] Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies [paper] [code] 🎬
[NeurIPS 2022 Workshop] Videogenic: Video Highlights via Photogenic Moments [paper] [project page]
[AutoUI 2021] Automatic Generation of Road Trip Summary Video for Reminiscence and Entertainment using Dashcam Video [paper]
[MM 2021] Automated Multi-Modal Video Editing for Ads Video [paper] 🛒
[MM 2021] VideoDiscovery: An Automatic Short-Video Generation System for E-commerce Live-streaming [paper] [project page] 🛒
[ECCV 2020] Learning Trailer Moments in Full-Length Movies [paper] 🎬
[MMAsia 2019] Domain Specific and Idiom Adaptive Video Summarization [paper] 🛒
[MM 2019] Personalized Video Summarization with Idiom Adaptation [paper] 🛒
[MM 2019] Generating 1 Minute Summaries of Day Long Egocentric Videos [paper] [code]
[TMM 2019] Automatic Curation of Sports Highlights Using Multimodal Excitement Features [paper] 🏌️🎾
[ICNC-FSKD 2019] Towards Data-Driven Automatic Video Editing [paper] 🎬
[CVPR 2018 Workshop] The Excitement of Sports: Automatic Highlights Using Audio/Visual Cues [paper] 🏌️🎾
[CVPR 2013] Story-Driven Summarization for Egocentric Video [paper]
[MM 2003] AVE: automated home video editing [paper]

Other Forms of Editing

[arXiv 2024] VCoME: Verbal Video Composition with Multimodal Editing Effects [paper] [code] 💬
[CHI 2024] ChunkyEdit: Text-first video interview editing via chunking [paper] 💬
[IUI 2024] ExpressEdit: Video Editing with Natural Language and Sketching [paper] [code] [project page]
[UIST 2023] Automated Conversion of Music Videos into Lyric Videos [paper] [project page]
[NeurIPS 2022 Workshop] VideoMap: Video Editing in Latent Space [paper] [project page]
[ECCV 2022] AutoTransition: Learning to Recommend Video Transition Effects [paper] [code] [dataset]
[IJCAI 2020 Demonstrations Track] An AI-Empowered Visual Storyline Generator [paper] 🛒
[AAAI 2020 Student Abstract] Generating Engaging Promotional Videos for E-commerce Platforms [paper] 🛒
[UIST 2020] Automatic Video Creation From a Web Page [paper]
[TOM 2020] AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos With Deep Learning [paper]
[CHI 2019] B-Script: Transcript-based B-roll Video Editing with Recommendations [paper]

Fast-Forwarding & Retiming

To change the video speed.

[PRL 2023] A Multimodal Hyperlapse Method Based on Video and Songs' Emotion Alignment [paper]
[TPAMI 2023] Text-Driven Video Acceleration: A Weakly-Supervised Reinforcement Learning Method [paper] [project page]
[CVPR 2022 Workshop] Video-ReTime: Learning Temporally Varying Speediness for Time Remapping [paper]
[TPAMI 2020] A Sparse Sampling-Based Framework for Semantic Fast-Forward of First-Person Videos [paper]
[MM 2020] Automated Aesthetic Enhancement of Videos [paper] 💃
[CVPR 2018] A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos [paper] [project page]
[ECCV 2016 Workshop] Towards Semantic Fast-Forward and Stabilized Egocentric Videos [paper]
[ICIP 2016] Fast-Forward Video Based on Semantic Extraction [paper]
[TOG 2015] Real-Time Hyperlapse Creation via Optimal Frame Selection [paper]
[TOG 2014] First-Person Hyper-Lapse Videos [paper]

Music-Driven Editing

[arXiv 2023] AutoMatch: A Large-scale Audio Beat Matching Benchmark for Boosting Deep Learning Assistant Video Editing [paper]
[SIBGRAPI 2021] Musical Hyperlapse: A Multimodal Approach to Accelerate First-Person Videos [paper]
[CVPR 2018 Workshop] Visual Rhythm and Beat [paper]
[TOG 2015] audeosynth: Music-Driven Video Montage [paper]

Spatial Editing

To crop the video based on actionness, aesthetics, etc.

[arXiv 2024] Reframe Anything: LLM Agent for Open World Video Reframing [paper]
[WACV 2024] Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras from Wide-Angle Monocular Video Recordings [paper]
[CVPR 2020 Workshop] As Seen on TV: Automatic Basketball Video Production Using Gaussian-Based Actionness and Game States Recognition [paper] [project page] 🏀
[CHI 2020] GAZED– Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings [paper] 💃
[SA 2017 Poster] Aesthetic Temporal and Spatial Editing of Casual Videos [paper]

Video Editing Styles Transfer

To extract the editing style of a source video and apply it to other video footage.

[CVPR 2023] JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields [paper] [project page]
[CVPR 2021 Workshop] Editing Like Humans: A Contextual, Multimodal Framework for Automated Video Editing [paper] [project page] 💬
[CVPR 2021 Workshop] Automatic Non-Linear Video Editing Transfer [paper]

Virtual Cinematography

[CVPR 2024] Cinematic Behavior Transfer via NeRF-based Differentiable Filming [paper] [project page]
[SIGGRAPH 2023 Poster] Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production [paper] [project page]
[CHI 2021] Virtual Camera Layout Generation using a Reference Video [paper]
[TOMCCAP 2018] Thinking Like a Director: Film Editing Patterns for Virtual Cinematographic Storytelling [paper]

Datasets And More

Datasets and papers related to video editing, camera movement 🎥, shot type 🖼️, etc.

[arXiv 2024] Edit3K: Universal Representation Learning for Video Editing Components [paper]
[CVPR 2024] Neighbor Relations Matter in Video Scene Detection [paper] [code]
[WACV 2024] Movie Genre Classification by Language Augmentation and Shot Sampling [paper] [code] 🎬
[IMXw 2023] Recognition of Camera Angle and Camera Level in Movies from Single Frames [paper] [project page] 🎬🖼️
[ICCV 2023 Workshop] LEMMS: Label Estimation of Multi-feature Movie Segments [paper] 🎬🖼️
[ICCV 2023] Long-range Multimodal Pretraining for Movie Understanding [paper] 🎬
[ECCV 2022 Workshop] Movie Lens: Discovering and Characterizing Editing Patterns in the Analysis of Short Movie Sequences [paper] 🎬
[ECCV 2022] The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing [paper] [code & dataset] 🎬🎥🖼️
[ECCV 2022] MovieCuts: A New Dataset and Benchmark for Cut Type Recognition [paper] [code & dataset] 🎬
[ICIP 2022] HISTORIAN: A Large-Scale HISTORIcal Film Dataset with Cinematographic ANnotation [paper] [code & dataset] 🎬🎥
[ICIS Fall 2021] RO-TextCNN Based MUL-MOVE-Net for Camera Motion Classification [paper] [code & dataset] 🎥
[ICCV 2021 Workshop] High-Level Features for Movie Style Understanding [paper] 🎬🎥
[ECCV 2020] MovieNet: A Holistic Dataset for Movie Understanding [paper] [code] [project page & dataset] 🎬🎥🖼️
[ECCV 2020] A Unified Framework for Shot Type Classification Based on Subject Centric Lens [paper] [project page & dataset] 🎬🎥🖼️
[ICIP 2011] Using Context Saliency For Movie Shot Classification [paper] 🎬🖼️

wentianli/awesome-video-editing
