Robust tools for visualizing speaker diarization are scarce, even though visualization is critical for analyzing datasets and algorithm outputs. This repository offers intuitive ways to display speaker diarization results. The main criterion for choosing the underlying visualization software was support for interactive operation. These tools still have room for improvement, but they are the best options currently available.
Go to: Visualization tool for Audio-only datasets
Go to: Visualization tool for Audio-visual datasets
python audio_visualized.py -rttm audio_cases/afjiv.rttm -audio_path audio_cases/afjiv.wav -praat_result audio_cases/afjiv.txt
- rttm --- the reference or system RTTM
- audio_path --- the audio path
- praat_result --- visualized result for Praat software
(Example is from VoxConverse)
You can scroll horizontally through the recording. Speaker labels are shown on each timeline (e.g., spk00, spk01, ...).
Some useful shortcuts:
- CMD + A: Show all utterances on one screen.
- CMD + N: Zoom in to the selected area.
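For reference, the RTTM file passed via -rttm can be inspected before it is visualized. The sketch below is only an illustration, assuming the common space-separated RTTM layout in which the 4th, 5th, and 8th fields hold the onset, duration, and speaker label. `parse_rttm` is a hypothetical helper and is not part of this repository's scripts; the sample lines are made up.

```python
from collections import defaultdict


def parse_rttm(lines):
    """Group (start, end) segments by speaker label.

    Assumes the usual RTTM layout:
    SPEAKER <file> <channel> <onset> <duration> <NA> <NA> <speaker> <NA> <NA>
    """
    segments = defaultdict(list)
    for line in lines:
        fields = line.split()
        # Skip blank lines and any record that is not a SPEAKER entry.
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        start = float(fields[3])
        duration = float(fields[4])
        speaker = fields[7]
        segments[speaker].append((start, start + duration))
    return dict(segments)


# Made-up sample lines, for illustration only.
example = [
    "SPEAKER afjiv 1 0.50 2.00 <NA> <NA> spk00 <NA> <NA>",
    "SPEAKER afjiv 1 3.00 1.25 <NA> <NA> spk01 <NA> <NA>",
    "SPEAKER afjiv 1 5.00 0.75 <NA> <NA> spk00 <NA> <NA>",
]

for speaker, spans in sorted(parse_rttm(example).items()):
    print(speaker, spans)
```

Each speaker label maps to one list of time spans, which matches the one-timeline-per-speaker view described above.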
python audio_visual_visualized.py -rttm audio_visual_cases/00115.rttm -mp4_path audio_visual_cases/00115.mp4 -via_json_result audio_visual_cases/00115.json
- rttm --- the reference or system RTTM
- mp4_path --- the mp4 path
- via_json_result --- visualized result for VIA software
(Example is from MSDWild)
If the video cannot be previewed, or previews slowly, try converting it to an MP4 format that HTML5 supports:
ffmpeg -i original.mp4 -vcodec libx264 -acodec aac -preset fast -movflags +faststart previewed.mp4
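When many videos need the same conversion, the command can be wrapped in a small script. The sketch below is one possible approach, assuming `ffmpeg` is on the PATH. `build_ffmpeg_command` and `convert_folder` are hypothetical helpers, not part of this repository, and the folder names in the usage note are placeholders.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst):
    """Return the argument list for the same ffmpeg call shown above."""
    return [
        "ffmpeg", "-i", str(src),
        "-vcodec", "libx264",
        "-acodec", "aac",
        "-preset", "fast",
        "-movflags", "+faststart",  # move metadata to the front for faster preview
        str(dst),
    ]


def convert_folder(in_dir, out_dir):
    """Convert every .mp4 in in_dir, writing results under out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.mp4")):
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(build_ffmpeg_command(src, out_dir / src.name), check=True)
```

For example, `convert_folder("raw_videos", "previewed_videos")` would write converted copies with the same file names into a separate folder and leave the originals untouched.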
- Download via_video_annotator.html from URL, or use an online demo directly. This page works as an offline client, and we have tested it with version via-3.0.11 (see the file via_video_annotator_3.0.11.html in this repo).
- Import the JSON by clicking the folder button.
- You can also modify the script to support online URLs from OSS (Object Storage Service).
You can use the Space key to play or pause the media.
More keys can be found on: