A paper list of automatic video editing and its related computer vision tasks.

wentianli/awesome-video-editing

awesome-video-editing

The papers are grouped into categories, which unavoidably overlap and are sometimes imprecise. I use icons to mark several frequent application scenarios: 💬 (talk/meeting), 💃 (dance/performance), ⚽🏀🏌️🎾 (sports), 🛒 (ads/promotional videos), 🎬 (movie), etc.

Note: This paper list does not include work on image/video manipulation (e.g. content editing, object removal, video stylization).


Text-based Video Editing

LLM-Powered Editing.

  • [IUI 2024] LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing [paper]

Using text as input to automatically create video sequences from a collection of videos or images.

  • [ICMR 2023] Shot Retrieval and Assembly with Text Script for Video Montage Generation [paper] [code]
  • [MM 2022] Transcript to Video: Efficient Clip Sequencing from Texts [paper] [project page]
  • [CHI 2020] Generating Audio-Visual Slideshows from Text Articles Using Word Concreteness [paper]
  • [TOG 2019] Write-A-Video: Computational Video Montage from Themed Text [paper]
  • [TMM 2020] Story-driven Video Editing [paper]
  • [IMX 2020] Joint Attention for Automated Video Editing [paper] 💬
  • [UIST 2016] QuickCut: An Interactive Tool for Editing Narrated Video [paper]

Modifying the transcript of a speech to change the speech content or to remove filler words.

  • [TOG 2019] Text-based Editing of Talking-head Video [paper] 💬
  • [TOG 2012] Tools for Placing Cuts and Transitions in Interview Video [paper] 💬

Cutting and Sequencing Shots

To cut unedited videos into shots and/or to put them in an appropriate order.

  • [ICASSP 2024] Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos [paper] [project page]
  • [CVPR 2024] Towards Automated Movie Trailer Generation [paper] 🎬
  • [MM 2023] A Reinforcement Learning-Based Automatic Video Editing [paper] 🎬
  • [ICCV 2023 Workshop] Representation Learning of Next Shot Selection for Vlog Editing [paper]
  • [WACV 2023] Match Cutting: Finding Cuts with Smooth Visual Transitions [paper] [code] [project page] 🎬
  • [ACCV 2022] Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward [paper] [code] 🛒
  • [SAC 2022] Automated Video Editing Based on Learned Styles Using LSTM-GAN [paper] 💃
  • [ICCV 2021 Workshop] Learning Where To Cut From Edited Videos [paper]
  • [ICCV 2021] Learning to Cut by Watching Movies [paper] [code & dataset] [project page]
  • [WACV 2018] Learning Video-Story Composition via Recurrent Neural Network [paper]
  • [arxiv 2018] From Trailers to Storylines: An Efficient Way to Learn from Movies [paper] 🎬
  • [CVPR 2016] Video-Story Composition via Plot Analysis [paper]

Multi-Camera / Multi-Take Editing

To select video shots from multiple camera views or multiple takes of the same event.

  • [MIG 2023] Real-time Computational Cinematographic Editing for Broadcasting of Volumetric-captured events: an Application to Ultimate Fighting [paper] 🥊
  • [TOG 2022] PopStage: The Generation of Stage Cross-Editing Video Based on Spatio-Temporal Matching [paper] [project page] 💃
  • [ECCV 2022 Workshop] Temporal and Contextual Transformer for Multi-Camera Editing of TV Shows [paper]
  • [ICME 2021] Reinforcement Learning Based Automatic Personal Mashup Generation [paper]
  • [TOMCCAP 2021] Smart Director: An Event-Driven Directing System for Live Broadcasting [paper] ⚽
  • [ICISP 2018] Automatic Camera Selection in the Context of Basketball Game [paper] 🏀
  • [TOG 2017] Computational Video Editing for Dialogue-Driven Scenes [paper] 💬
  • [ACE 2017] Automatic System for Editing Dance Videos Recorded Using Multiple Cameras [paper] 💃
  • [TOG 2014] Automatic Editing of Footage from Multiple Social Cameras [paper]
  • [CHI 2008] Improving Meeting Capture by Applying Television Production Principles with Audio and Motion Detection [paper] 💬
  • [ICME 2007] Automatic Multi-Modal Meeting Camera Selection for Video-Conferences and Meeting Browsers [paper] 💬

Video Summarization & Highlight Detection

  • [CVPR 2023] Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies [paper] [code] 🎬
  • [NeurIPS 2022 Workshop] Videogenic: Video Highlights via Photogenic Moments [paper] [project page]
  • [AutoUI 2021] Automatic Generation of Road Trip Summary Video for Reminiscence and Entertainment using Dashcam Video [paper]
  • [MM 2021] Automated Multi-Modal Video Editing for Ads Video [paper] 🛒
  • [MM 2021] VideoDiscovery: An Automatic Short-Video Generation System for E-commerce Live-streaming [paper] [project page] 🛒
  • [ECCV 2020] Learning Trailer Moments in Full-Length Movies [paper] 🎬
  • [MMAsia 2019] Domain Specific and Idiom Adaptive Video Summarization [paper] 🛒
  • [MM 2019] Personalized Video Summarization with Idiom Adaptation [paper] 🛒
  • [MM 2019] Generating 1 Minute Summaries of Day Long Egocentric Videos [paper] [code]
  • [TMM 2019] Automatic Curation of Sports Highlights Using Multimodal Excitement Features [paper] 🏌️🎾
  • [ICNC-FSKD 2019] Towards Data-Driven Automatic Video Editing [paper] 🎬
  • [CVPR 2018 Workshop] The Excitement of Sports: Automatic Highlights Using Audio/Visual Cues [paper] 🏌️🎾
  • [CVPR 2013] Story-Driven Summarization for Egocentric Video [paper]
  • [MM 2003] AVE: automated home video editing [paper]

Other Forms of Editing

  • [arxiv 2024] VCoME: Verbal Video Composition with Multimodal Editing Effects [paper] [code] 💬
  • [CHI 2024] ChunkyEdit: Text-first video interview editing via chunking [paper] 💬
  • [IUI 2024] ExpressEdit: Video Editing with Natural Language and Sketching [paper] [code] [project page]
  • [UIST 2023] Automated Conversion of Music Videos into Lyric Videos [paper] [project page]
  • [NeurIPS 2022 Workshop] VideoMap: Video Editing in Latent Space [paper] [project page]
  • [ECCV 2022] AutoTransition: Learning to Recommend Video Transition Effects [paper] [code] [dataset]
  • [IJCAI 2020 Demonstrations Track] An AI-Empowered Visual Storyline Generator [paper] 🛒
  • [AAAI 2020 Student Abstract] Generating Engaging Promotional Videos for E-commerce Platforms [paper] 🛒
  • [UIST 2020] Automatic Video Creation From a Web Page [paper]
  • [TOM 2020] AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos With Deep Learning [paper]
  • [CHI 2019] B-Script: Transcript-based B-roll Video Editing with Recommendations [paper]

Fast-Forwarding & Retiming

To change the video speed.

  • [PRL 2023] A Multimodal Hyperlapse Method Based on Video and Songs' Emotion Alignment [paper]
  • [TPAMI 2023] Text-Driven Video Acceleration: A Weakly-Supervised Reinforcement Learning Method [paper] [project page]
  • [CVPR 2022 Workshop] Video-ReTime: Learning Temporally Varying Speediness for Time Remapping [paper]
  • [TPAMI 2020] A Sparse Sampling-Based Framework for Semantic Fast-Forward of First-Person Videos [paper]
  • [MM 2020] Automated Aesthetic Enhancement of Videos [paper] 💃
  • [CVPR 2018] A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos [paper] [project page]
  • [ECCV 2016 Workshop] Towards Semantic Fast-Forward and Stabilized Egocentric Videos [paper]
  • [ICIP 2016] Fast-Forward Video Based on Semantic Extraction [paper]
  • [TOG 2015] Real-Time Hyperlapse Creation via Optimal Frame Selection [paper]
  • [TOG 2014] First-Person Hyper-Lapse Videos [paper]

Music-Driven Editing

  • [arxiv 2023] AutoMatch: A Large-scale Audio Beat Matching Benchmark for Boosting Deep Learning Assistant Video Editing [paper]
  • [SIBGRAPI 2021] Musical Hyperlapse: A Multimodal Approach to Accelerate First-Person Videos [paper]
  • [CVPR 2018 Workshop] Visual Rhythm and Beat [paper]
  • [TOG 2015] audeosynth: Music-Driven Video Montage [paper]

Spatial Editing

To crop the video based on actionness, aesthetics, etc.

  • [arxiv 2024] Reframe Anything: LLM Agent for Open World Video Reframing [paper]
  • [WACV 2024] Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras from Wide-Angle Monocular Video Recordings [paper]
  • [CVPR 2020 Workshop] As Seen on TV: Automatic Basketball Video Production Using Gaussian-Based Actionness and Game States Recognition [paper] [project page] 🏀
  • [CHI 2020] GAZED– Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings [paper] 💃
  • [SA 2017 Poster] Aesthetic Temporal and Spatial Editing of Casual Videos [paper]

Video Editing Styles Transfer

To extract the editing style of a source video and apply it to other video footage.

  • [CVPR 2023] JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields [paper] [project page]
  • [CVPR 2021 Workshop] Editing Like Humans: A Contextual, Multimodal Framework for Automated Video Editing [paper] [project page] 💬
  • [CVPR 2021 Workshop] Automatic Non-Linear Video Editing Transfer [paper]

Virtual Cinematography

  • [CVPR 2024] Cinematic Behavior Transfer via NeRF-based Differentiable Filming [paper] [project page]
  • [SIGGRAPH 2023 Poster] Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production [paper] [project page]
  • [CHI 2021] Virtual Camera Layout Generation using a Reference Video [paper]
  • [TOMCCAP 2018] Thinking Like a Director: Film Editing Patterns for Virtual Cinematographic Storytelling [paper]

Datasets And More

Datasets and papers related to video editing, camera movement 🎥, shot type 🖼️, etc.

  • [arxiv 2024] Edit3K: Universal Representation Learning for Video Editing Components [paper]
  • [CVPR 2024] Neighbor Relations Matter in Video Scene Detection [paper] [code]
  • [WACV 2024] Movie Genre Classification by Language Augmentation and Shot Sampling [paper] [code] 🎬
  • [IMXw 2023] Recognition of Camera Angle and Camera Level in Movies from Single Frames [paper] [project page] 🎬🖼️
  • [ICCV 2023 Workshop] LEMMS: Label Estimation of Multi-feature Movie Segments [paper] 🎬🖼️
  • [ICCV 2023] Long-range Multimodal Pretraining for Movie Understanding [paper] 🎬
  • [ECCV 2022 Workshop] Movie Lens: Discovering and Characterizing Editing Patterns in the Analysis of Short Movie Sequences [paper] 🎬
  • [ECCV 2022] The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing [paper] [code & dataset] 🎬🎥🖼️
  • [ECCV 2022] MovieCuts: A New Dataset and Benchmark for Cut Type Recognition [paper] [code & dataset] 🎬
  • [ICIP 2022] HISTORIAN: A Large-Scale HISTORIcal Film Dataset with Cinematographic ANnotation [paper] [code & dataset] 🎬🎥
  • [ICIS Fall 2021] RO-TextCNN Based MUL-MOVE-Net for Camera Motion Classification [paper] [code & dataset] 🎥
  • [ICCV 2021 Workshop] High-Level Features for Movie Style Understanding [paper] 🎬🎥
  • [ECCV 2020] MovieNet: A Holistic Dataset for Movie Understanding [paper] [code] [project page & dataset] 🎬🎥🖼️
  • [ECCV 2020] A Unified Framework for Shot Type Classification Based on Subject Centric Lens [paper] [project page & dataset] 🎬🎥🖼️
  • [ICIP 2011] Using Context Saliency For Movie Shot Classification [paper] 🎬🖼️
