
Benchmark script + improvements and bug fixes #46

Merged 23 commits into develop on May 8, 2022
Conversation

juanmc2005
Copy link
Owner

This PR addresses issues #35 and #39

Changelog

  • Add diart.benchmark script for fast inference, evaluation and real-time latency estimation of the pipeline (issue Add batched mode for faster inference on pre-recorded conversations #35). Only the standard pipeline is supported for now; custom pipelines will have to modify some internals until this is improved.
  • Add GPU compatibility to diart.benchmark and diart.demo via the --gpu flag
  • Add real-time latency estimation to FileAudioSource with the profile argument (issue Benchmarking operator #39)
  • Add ChunkLoader to centralize all file chunking
  • Add tqdm progress operator: observable.pipe(dops.progress()).subscribe(...)
  • Add PipelineConfig to encapsulate the configuration of the pipeline
  • Add --no-plot argument to diart.demo to skip plotting (as mentioned in Question about reproduce the result #36)
  • Make blocks in the functional module compatible with batched inference
  • Fix resolution bug in FrameWiseModel, which always used the chunk duration from model training instead of the duration of the given chunk
  • Make models compatible with torch.Tensor and numpy.ndarray inputs (SlidingWindowFeature compatibility is kept but only for non-batched mode)
  • Add OnlineSpeakerTracking as a submodule of the pipeline including clustering and output reconstruction
  • Add from_file() method to OnlineSpeakerDiarization to run batched inference, as opposed to from_source(), which runs online inference
  • Sources can now provide the number of chunks that they will emit (if known) via the length: Optional[int] property
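As a rough illustration of the last point, the number of chunks a file will emit can be computed from its duration and the sliding-window parameters alone, without loading any audio. This is a hypothetical sketch (the helper name and signature are mine, not diart's actual ChunkLoader API):

```python
import math

def num_chunks(file_duration: float, chunk_duration: float, step: float) -> int:
    """Estimate how many sliding-window chunks a file yields without
    loading it: one chunk at the start, plus one per additional step
    that still fits entirely inside the file."""
    if file_duration < chunk_duration:
        return 0
    return 1 + math.floor((file_duration - chunk_duration) / step)

# A 10s file, 5s chunks, 0.5s step: 1 + (10 - 5) / 0.5 = 11 chunks
print(num_chunks(10.0, 5.0, 0.5))  # → 11
```

This kind of closed-form count is what lets a source report length before emitting anything, so a progress bar can show a total up front.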

Notes

  • Benchmarked performance on the AMI test set with latency=5: DER = 27.3
  • Benchmarked performance on the VoxConverse test set with latency=5: DER = 17.1
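For readers unfamiliar with the metric, DER (diarization error rate) is conventionally the total duration of false alarm, missed detection, and speaker confusion, divided by the total reference speech duration, expressed as a percentage. A minimal sketch of that standard formula (the figures above were produced by the benchmark script, not by this snippet):

```python
def der(false_alarm: float, missed: float, confusion: float,
        total_speech: float) -> float:
    """Standard diarization error rate, in percent: all error time
    (false alarm + miss + speaker confusion) over reference speech time."""
    return 100.0 * (false_alarm + missed + confusion) / total_speech

# 1s false alarm, 2s missed, 1s confusion over 40s of speech -> 10% DER
print(der(1.0, 2.0, 1.0, 40.0))  # → 10.0
```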

@juanmc2005 juanmc2005 added bug Something isn't working feature New feature or request labels May 4, 2022
@juanmc2005 juanmc2005 added this to the Version 0.3 milestone May 4, 2022
@juanmc2005 juanmc2005 self-assigned this May 4, 2022
@juanmc2005 juanmc2005 requested a review from hbredin May 4, 2022 16:14
@juanmc2005 juanmc2005 merged commit 7ecae25 into develop May 8, 2022
@juanmc2005 juanmc2005 deleted the feat/batched branch May 8, 2022 15:54