From b942cdd6ded0a85e6ed0fdc79e22bd23aecd1bd0 Mon Sep 17 00:00:00 2001
From: mrq
Date: Wed, 22 Mar 2023 19:28:58 +0000
Subject: [PATCH] Update 'Training'

---
 Training.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Training.md b/Training.md
index dd51ff1..dc0f6b7 100644
--- a/Training.md
+++ b/Training.md
@@ -83,6 +83,12 @@ A lot of it should be fairly hand-held, but the biggest point is to double check
 
 * **!**NOTE**!**: be very careful with naively trusting how well the audio is segmented. Be sure to manually curate how well they were segmented
 
+### WhisperX
+
+The web UI also offers support for using [`m-bain/whisperx`](https://github.com/m-bain/whisperX/) as a transcription backend.
+
+With it, you can leverage its VAD filter, batching, and diarization features for faster and more accurate transcriptions. Unfortunately, all of these require a HF token and accepting the associated agreements (consult the whisperx repo for details on doing that).
+
 ### Phonemizer
 
 **!**NOTE**!**: use of [`phonemizer`](https://github.com/bootphon/phonemizer) requires `espeak-ng` installed, or an equivalent backend. Any errors thrown from it are an issue with `phonemizer` itself.