# TorToiSe TTS An unofficial PyTorch re-implementation of [TorToise TTS](https://github.com/neonbjb/tortoise-tts/tree/98a891e66e7a1f11a830f31bd1ce06cc1f6a88af). ## Requirements A working PyTorch environment. ## Install Simply run `pip install git+https://git.ecker.tech/mrq/tortoise-tts` or `pip install git+https://github.com/e-c-k-e-r/tortoise-tts`. ## To-Do - [X] Reimplement original inferencing through TorToiSe (as done with `api.py`) - [X] Implement training support (without DLAS) - [X] Feature parity with the VALL-E training setup with preparing a dataset ahead of time - [ ] Automagic handling of the original weights into compatible weights - [ ] Extend the original inference routine with additional features: - [x] non-float32 / mixed precision - [x] BitsAndBytes support - [x] LoRAs - [x] Web UI - [ ] Feature parity with [ai-voice-cloning](https://git.ecker.tech/mrq/ai-voice-cloning) - [ ] Additional samplers for the autoregressive model - [ ] Additional samplers for the diffusion model - [ ] BigVGAN in place of the original vocoder - [ ] XFormers / flash_attention_2 for the autoregressive model - [ ] Some vector embedding store to find the "best" utterance to pick ## Why? To correct the mess I've made with forking TorToiSe TTS originally with a bunch of slopcode, and the nightmare that ai-voice-cloning turned out. Additional features can be applied to the program through a framework of my own that I'm very familiar with. ## License Unless otherwise credited/noted in this README or within the designated Python file, this repository is [licensed](LICENSE) under AGPLv3.