Below are some samples from my VALL-E implementation: https://git.ecker.tech/mrq/vall-e/. I do not consider these to be state of the art. The first set uses LibriSpeech utterances and compares my outputs against the samples presented in the original VALL-E demo.
Text | Prompt | Ground Truth | Our VALL-E | Original VALL-E | YourTTS |
---|---|---|---|---|---|
Below are some extra samples.
Text | Prompt | Ground Truth | Our VALL-E |
---|---|---|---|