An unofficial PyTorch implementation of VALL-E


Latest commit: 5a09a5f6e9 by mrq, "I forgot about the time embedding...", 2024-11-08 22:46:26 -06:00

VALL'E

An unofficial PyTorch implementation of VALL-E, utilizing the EnCodec encoder/decoder.

A demo is available on HuggingFace here.

Requirements

Besides a working PyTorch environment, the only hard requirement is espeak-ng for phonemizing text:

- Linux users can install espeak/espeak-ng through their package manager.
- Windows users are required to install espeak-ng.
  - Additionally, you may be required to set the PHONEMIZER_ESPEAK_LIBRARY environment variable to specify the path to libespeak-ng.dll.
- In the future, an in-house replacement for this dependency would be fantastic.
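For the Windows case above, a minimal sketch of setting PHONEMIZER_ESPEAK_LIBRARY from Python before the phonemizer is loaded might look like the following. The DLL location is an assumption (a typical installer path, not something this repo specifies), and `configure_espeak` is a hypothetical helper, not part of this project.

```python
import os
import sys
from pathlib import Path

# ASSUMPTION: a typical install location for espeak-ng on Windows.
# Adjust this to wherever libespeak-ng.dll actually lives on your machine.
ASSUMED_DLL = Path(r"C:\Program Files\eSpeak NG\libespeak-ng.dll")

ENV_KEY = "PHONEMIZER_ESPEAK_LIBRARY"


def configure_espeak(dll_path=ASSUMED_DLL, env=os.environ, platform=sys.platform):
    """Hypothetical helper: point the phonemizer at libespeak-ng.dll.

    Returns the value in effect afterwards, or None if nothing was set.
    """
    # Respect a value the user has already exported.
    if env.get(ENV_KEY):
        return env[ENV_KEY]
    # Only set the variable on Windows, and only if the DLL really exists.
    if platform == "win32" and Path(dll_path).is_file():
        env[ENV_KEY] = str(dll_path)
        return env[ENV_KEY]
    return None


if __name__ == "__main__":
    print(configure_espeak())
```

Call this before anything that phonemizes text is imported, so the variable is already in place when the library is looked up.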

Install

Simply run pip install git+https://git.ecker.tech/mrq/vall-e or pip install git+https://github.com/e-c-k-e-r/vall-e.

I've tested this repo under Python versions 3.10.9, 3.11.3, and 3.12.3.
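If you want to confirm your interpreter falls within the tested range before installing, a quick check along these lines may help. This is only a sketch: `is_tested_python` is a hypothetical helper, and it compares major/minor versions only, on the assumption that patch releases behave alike.

```python
import sys

# Major/minor pairs derived from the versions listed above
# (3.10.9, 3.11.3, 3.12.3). Patch level is deliberately ignored.
TESTED_MINORS = {(3, 10), (3, 11), (3, 12)}


def is_tested_python(version_info=None):
    """Hypothetical helper: True if the interpreter matches a tested minor version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in TESTED_MINORS


if __name__ == "__main__":
    if not is_tested_python():
        print("Warning: this Python version is outside the tested set.")
```

Other versions may well work; they are simply not among the ones reported as tested.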

Pre-Trained Model

My pre-trained weights can be acquired from here.

A script to set up a proper environment and download the weights can be invoked with ./scripts/setup.sh. It automatically creates a venv and downloads the ar+nar-llama-8 weights and config file to the right place.

When running inference, through either the web UI or the CLI, if no model is passed, the default model is downloaded automatically and should also update automatically.

Documentation

The documentation provided under ./docs/ should thoroughly cover most, if not all, of this project.

Markdown files should correspond directly to their respective file or folder under ./vall_e/.
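As a rough illustration of that correspondence, the sketch below maps a documentation path to the module file or package folder it would describe. `source_candidates` is a hypothetical helper, and the file name in the usage line is made up for illustration; check the actual contents of ./docs/ and ./vall_e/ rather than relying on it.

```python
from pathlib import Path


def source_candidates(doc_path, docs_root="docs", package_root="vall_e"):
    """Hypothetical helper: list where the code for a docs page would live.

    Returns two candidates, a module file and a package folder, since a
    Markdown file may correspond to either.
    """
    # Strip the docs root and the .md suffix, e.g. docs/foo.md -> foo
    relative = Path(doc_path).relative_to(docs_root).with_suffix("")
    base = Path(package_root) / relative
    return [base.with_suffix(".py"), base]


if __name__ == "__main__":
    # "example.md" is an illustrative name, not a file known to exist.
    for candidate in source_candidates("docs/example.md"):
        print(candidate)
```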