diff --git a/examples/finetuned/lj/1.mp3 b/examples/finetuned/lj/1.mp3
index eec1e15..5eb9bac 100644
Binary files a/examples/finetuned/lj/1.mp3 and b/examples/finetuned/lj/1.mp3 differ
diff --git a/examples/finetuned/lj/2.mp3 b/examples/finetuned/lj/2.mp3
index 5eb9bac..eec1e15 100644
Binary files a/examples/finetuned/lj/2.mp3 and b/examples/finetuned/lj/2.mp3 differ
diff --git a/tortoise_v2_examples.html b/tortoise_v2_examples.html
index 5702ed3..51fcf34 100644
--- a/tortoise_v2_examples.html
+++ b/tortoise_v2_examples.html
@@ -36,13 +36,20 @@ available at https://github.co

LJSpeech is a popular dataset used to train small-scale TTS models. TorToiSe is a multi-voice model; the following shows how it renders the LJSpeech voice with no fine-tuning, compared with results for the same text from the popular Tacotron2 model paired with the Waveglow vocoder:

-Tacotron2+Waveglow | TorToiSe
+Tacotron2+Waveglow | TorToiSe | TorToiSe Finetuned
[audio sample rows]

All Results 🐢

Following are all the results from which the hand-picked results were drawn. Also included is the reference