<p align="center">
<img src="./vall-e.png" width="500px"></img>
</p>

# VALL'E

An unofficial PyTorch implementation of [VALL-E](https://vall-e-demo.ecker.tech/), utilizing the [EnCodec](https://github.com/facebookresearch/encodec) encoder/decoder.

A demo is available on HuggingFace [here](https://huggingface.co/spaces/ecker/vall-e).

## Requirements

Besides a working PyTorch environment, the only hard requirement is [`espeak-ng`](https://github.com/espeak-ng/espeak-ng/) for phonemizing text:
- Linux users can consult their package managers on installing `espeak`/`espeak-ng`.
- Windows users are required to install [`espeak-ng`](https://github.com/espeak-ng/espeak-ng/releases/tag/1.51#Assets).
  + additionally, you may be required to set the `PHONEMIZER_ESPEAK_LIBRARY` environment variable to specify the path to `libespeak-ng.dll`.
- In the future, replacing this dependency with an in-house phonemizer would be ideal.
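
If the library is not picked up on Windows, the environment variable can also be set from Python before anything that phonemizes text is imported. The following is a minimal sketch rather than part of this repo: the helper name `configure_espeak` is hypothetical, and the default DLL path is only an assumption about where the espeak-ng installer places the file, so adjust it to your system.

```python
import os
from pathlib import Path

# Assumption: a typical install location for the espeak-ng Windows installer.
# Change this to wherever libespeak-ng.dll actually lives on your machine.
DEFAULT_DLL = Path(r"C:\Program Files\eSpeak NG\libespeak-ng.dll")


def configure_espeak(library_path=DEFAULT_DLL, env=os.environ):
    """Set PHONEMIZER_ESPEAK_LIBRARY if the given library file exists.

    Returns True when the variable was set, False when the file is missing.
    (Hypothetical helper for illustration; not provided by this repo.)
    """
    library_path = Path(library_path)
    if not library_path.is_file():
        return False
    env["PHONEMIZER_ESPEAK_LIBRARY"] = str(library_path)
    return True


if __name__ == "__main__":
    if configure_espeak():
        print("PHONEMIZER_ESPEAK_LIBRARY =", os.environ["PHONEMIZER_ESPEAK_LIBRARY"])
    else:
        print("libespeak-ng.dll not found; pass the correct path to configure_espeak().")
```

Call it at the very top of your entry script so the variable is already in place when the phonemizer backend loads.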

## Install

Simply run `pip install git+https://git.ecker.tech/mrq/vall-e` or `pip install git+https://github.com/e-c-k-e-r/vall-e`.

I've tested this repo under Python versions `3.10.9`, `3.11.3`, and `3.12.3`.
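
To check quickly whether your interpreter matches one of the tested release lines, something like the sketch below may help. It is illustrative only: `is_tested_python` is a hypothetical helper, and comparing only the major and minor version (not the patch level) is an assumption about what matters for compatibility.

```python
import sys

# Major/minor pairs taken from the tested versions listed above
# (3.10.9, 3.11.3, 3.12.3); patch levels are deliberately ignored.
TESTED_MINORS = {(3, 10), (3, 11), (3, 12)}


def is_tested_python(version_info=sys.version_info):
    """Return True if the major.minor version is among the tested ones."""
    return (version_info[0], version_info[1]) in TESTED_MINORS


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    if is_tested_python():
        print(f"Python {current} is among the tested versions.")
    else:
        print(f"Python {current} has not been tested with this repo.")
```

Other versions may still work; the check only reports whether you are on a release line that has been tried.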

## Pre-Trained Model

My pre-trained weights can be acquired from [here](https://huggingface.co/ecker/vall-e).

A script that sets up a suitable environment and downloads the weights can be run with `./scripts/setup.sh`. It automatically creates a `venv` and downloads the `ar+nar-llama-8` weights and config file to the right place.

When running inference through either the web UI or the CLI, the default model is downloaded automatically if no model is passed, and it should also update automatically.

## Documentation

The documentation under [./docs/](./docs/) should cover most, if not all, of this project.

Markdown files should correspond directly to their respective file or folder under `./vall_e/`.