Starting the vall-e backend crashes #337
Reference: mrq/ai-voice-cloning#337
```
$ ./start.sh --tts-backend="vall-e"
Whisper detected
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/src/utils.py", line 88, in <module>
    from vall_e.inference import TTS as VALLE_TTS
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/inference.py", line 15, in <module>
    from .train import load_engines
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/train.py", line 4, in <module>
    from .data import create_train_val_dataloader
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/data.py", line 597, in <module>
    @cfg.diskcache()
  File "/usr/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/config.py", line 460, in diskcache
    return diskcache.Cache(self.cache_dir).memoize
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/config.py", line 455, in cache_dir
    return ".cache" / self.relpath
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/config.py", line 28, in relpath
    return Path(self.cfg_path)
  File "/usr/lib/python3.10/pathlib.py", line 960, in __new__
    self = cls._from_parts(args)
  File "/usr/lib/python3.10/pathlib.py", line 594, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/usr/lib/python3.10/pathlib.py", line 578, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/src/utils.py", line 105, in <module>
    import bark
ModuleNotFoundError: No module named 'bark'
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/./src/main.py", line 24, in <module>
    webui = setup_gradio()
  File "/home/user/ai-voice-cloning/src/webui.py", line 663, in setup_gradio
    EXEC_SETTINGS['valle_model'] = gr.Dropdown(choices=valle_models, label="VALL-E Model Config", value=args.valle_model if args.valle_model else valle_models[0])
IndexError: list index out of range
```
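The root `TypeError` in the log above fires because `cfg.cfg_path` was never set (no yaml was supplied on the command line), and `pathlib.Path` rejects `None`. A minimal sketch of the failure, plus a hypothetical guard (`relpath_safe` and its default path are illustrative, not the repo's actual code):

```python
from pathlib import Path

def relpath(cfg_path):
    # pathlib calls os.fspath() on its argument, which rejects None outright,
    # producing "expected str, bytes or os.PathLike object, not NoneType".
    return Path(cfg_path)

def relpath_safe(cfg_path, default="./training/valle/config.yaml"):
    # Hypothetical guard: fall back to a default config path when none is set.
    return Path(cfg_path) if cfg_path is not None else Path(default)

try:
    relpath(None)  # mirrors cfg.cfg_path never being assigned
except TypeError as err:
    print("reproduced:", err)
```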
Since I still have yet to get around to working on the web UI to update the integration for normal people, the web UI expects a model to be present under `./training/`. For example: `./training/valle/`, which contains `config.yaml` and `ckpt`.

Okay, I placed the valle folder. Correct me if I did anything wrong, because I still got errors. It's currently placed like this:
```
/ai-voice-cloning/training/valle/ckpt/ar-retnet-4/fp32.pth
/ai-voice-cloning/training/valle/ckpt/nar-retnet-4/fp32.pth
/ai-voice-cloning/training/valle/config.yaml
```
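Assuming the layout above is what the web UI scans for, it can be sanity-checked before launching; `check_valle_layout` is a hypothetical helper written for this thread, not part of the repo:

```python
from pathlib import Path

def check_valle_layout(root="./training/valle"):
    """Return the expected model files that are missing under root (hypothetical helper)."""
    root = Path(root)
    expected = [
        root / "config.yaml",
        root / "ckpt" / "ar-retnet-4" / "fp32.pth",
        root / "ckpt" / "nar-retnet-4" / "fp32.pth",
    ]
    return [str(p) for p in expected if not p.is_file()]

missing = check_valle_layout()
if missing:
    print("Missing files:", *missing, sep="\n  ")
```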
```
$ ./start.sh --tts-backend="vall-e"
Whisper detected
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/src/utils.py", line 88, in <module>
    from vall_e.inference import TTS as VALLE_TTS
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/inference.py", line 15, in <module>
    from .train import load_engines
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/train.py", line 4, in <module>
    from .data import create_train_val_dataloader
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/data.py", line 597, in <module>
    @cfg.diskcache()
  File "/usr/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/config.py", line 460, in diskcache
    return diskcache.Cache(self.cache_dir).memoize
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/config.py", line 455, in cache_dir
    return ".cache" / self.relpath
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/config.py", line 28, in relpath
    return Path(self.cfg_path)
  File "/usr/lib/python3.10/pathlib.py", line 960, in __new__
    self = cls._from_parts(args)
  File "/usr/lib/python3.10/pathlib.py", line 594, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/usr/lib/python3.10/pathlib.py", line 578, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/src/utils.py", line 105, in <module>
    import bark
ModuleNotFoundError: No module named 'bark'
Running on local URL: http://XXX.X.X.X:XXXX
To create a public link, set share=True in launch().
Loading VALL-E... (Config: None)
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/./src/main.py", line 27, in <module>
    tts = load_tts()
  File "/home/user/ai-voice-cloning/src/utils.py", line 3629, in load_tts
    tts = VALLE_TTS(config=args.valle_model)
NameError: name 'VALLE_TTS' is not defined
```
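The `NameError` arises because the `VALLE_TTS` import at the top of `utils.py` sits inside a try/except: when the import fails (as in the first traceback in this log), the name is never bound, so the later call site crashes with a `NameError` instead of a useful message. A common pattern is to bind a sentinel instead; this is a sketch of that pattern, not the repo's actual code:

```python
# Guarded import that binds a sentinel instead of leaving the name undefined.
try:
    from vall_e.inference import TTS as VALLE_TTS  # may fail for many reasons
except Exception:
    VALLE_TTS = None  # name is always bound, so later checks are explicit

def load_valle_tts(config):
    # An explicit check gives a readable error instead of a NameError.
    if VALLE_TTS is None:
        raise RuntimeError("VALL-E backend unavailable: its import failed at startup")
    return VALLE_TTS(config=config)
```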
Right. I forgot to try and figure out an elegant solution to that during my infrequent inference tests using the web UI. Use `./start.sh --tts-backend="vall-e" yaml="./training/valle/config.yaml"`.

Remedied in mrq/vall-e commit d1065984.

Okay, that did it! Thank you for the quick response!
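The "elegant solution" hinted at would be defaulting the config path when no `yaml=` argument is given. A hedged sketch of such a fallback (the function and its parameters are hypothetical; the real repo wires this differently):

```python
from pathlib import Path

def resolve_valle_config(cli_yaml=None, training_dir="./training/valle"):
    """Prefer an explicit yaml=... argument, else look in the training directory."""
    if cli_yaml:
        return Path(cli_yaml)
    candidate = Path(training_dir) / "config.yaml"
    if candidate.is_file():
        return candidate
    raise FileNotFoundError(
        f"no VALL-E config found; pass yaml=... or create {candidate}"
    )
```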
Using this command works:

`./start.sh --tts-backend="vall-e" yaml="./training/valle/config.yaml"`

But using this command brings up an error:

`./start.sh --tts-backend="vall-e"`
```
Whisper detected
[2023-08-23 09:08:37,714] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-08-23 09:08:40,229] [INFO] [comm.py:631:init_distributed] cdb=None
[2023-08-23 09:08:40,229] [INFO] [comm.py:662:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
VALL-E detected
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/src/utils.py", line 105, in <module>
    import bark
ModuleNotFoundError: No module named 'bark'
Running on local URL: http://XXX.X.X.X:XXXX
To create a public link, set share=True in launch().
Loading VALL-E... (Config: None)
Traceback (most recent call last):
  File "/home/user/ai-voice-cloning/./src/main.py", line 27, in <module>
    tts = load_tts()
  File "/home/user/ai-voice-cloning/src/utils.py", line 3629, in load_tts
    tts = VALLE_TTS(config=args.valle_model)
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/inference.py", line 55, in __init__
    self.load_models()
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/inference.py", line 64, in load_models
    engines = load_engines()
  File "/home/user/ai-voice-cloning/modules/vall-e/vall_e/utils/trainer.py", line 64, in load_engines
    models = get_models(cfg.models.get())
TypeError: Models.get() missing 1 required positional argument: 'self'
```
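The final `TypeError` is the classic symptom of calling an instance method through the class itself: `cfg.models` here apparently evaluates to the `Models` class rather than a `Models` instance, so `self` never gets bound. A minimal reproduction (the class below is a stand-in named after the traceback, not the repo's code):

```python
class Models:
    """Stand-in for the config's Models container (name taken from the traceback)."""
    def get(self):
        return ["ar-retnet-4", "nar-retnet-4"]

# Calling through the class leaves 'self' unfilled and reproduces the crash.
try:
    Models.get()
except TypeError as err:
    print(err)  # Models.get() missing 1 required positional argument: 'self'

# Calling through an instance binds self and succeeds.
models = Models().get()
```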
Having the same issue as the last one on the default launch through `start.bat --tts-backend="vall-e"`, and getting `NameError: name 'VALLE_TTS' is not defined` when launching with `yaml="./training/valle/config.yaml"`. The config file, the two .pth files, and data.h5 are all placed in the training valle folder.

I assume it is easier to launch VALL-E through its own separate fork rather than the web UI?
Ok, I solved the issue and was able to launch it on Windows. Written below are my steps up until finally generating something with VALL-E through the web UI. I am not a coder, so consider that I just tried to fix my current errors without thinking about how that might break other code and functions. This was quite painful, and I hope in the future it will be as easy to set up as TorToiSe.

First of all, I launched it as recommended in this thread through `start.bat --tts-backend="vall-e" yaml="./training/valle/config.yaml"`. Only that way does it launch with more or less reasonable errors.
Then I hit the already-known error `NameError: name 'VALLE_TTS' is not defined`, and even though the run crashed once on it, I tried moving the import out of the try/except statement at the beginning of utils.py. Eventually it started to work both ways, with the try/except or without it.

The next error was one I had seen together with the previous one: `ModuleNotFoundError: No module named 'deepspeed'`. So I went to check `modules\vall-e\vall_e\engines\__init__.py` and noticed the check for the engine choice. I remembered that DeepSpeed doesn't work on Windows (or it works, but I definitely don't have it pre-installed with the web UI), so I just commented out the whole if/elif statement and pasted `from .base import Engine` instead.

After this I got another error. I had no idea what it was, but `set_lr` made me think it was related to the config file. So instead of using the config.yaml from Hugging Face, I used the one that was pre-installed with vall-e (I have no idea why they are different). And it went further.
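Rather than commenting out the whole if/elif block, an optional-import fallback keeps DeepSpeed support where it is available. A sketch under the assumption that a plain base engine exists as described above (the `Engine` class here is a stand-in, not the real `vall_e.engines.base.Engine`):

```python
# Optional-backend selection instead of deleting the engine-choice logic.
try:
    import deepspeed  # typically unavailable on Windows installs
    HAVE_DEEPSPEED = True
except ImportError:
    HAVE_DEEPSPEED = False

class Engine:
    """Stand-in for the fallback base engine mentioned in the thread."""
    backend = "base"

def pick_engine_backend():
    # Fall back to the plain base engine when DeepSpeed cannot be imported.
    return "deepspeed" if HAVE_DEEPSPEED else Engine.backend
```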
Next I got `FileNotFoundError: [Errno 2] No such file or directory: 'training\\valle\\ckpt\\ar-retnet-4\\fp32.pth'`. I remembered that I had downloaded everything piecemeal from Hugging Face without the folders, so this time I recreated the Hugging Face folder hierarchy in my local valle folder.

So now I launch the web UI and get `RuntimeError: espeak not installed on your system` on generation. I installed espeak from https://github.com/espeak-ng/espeak-ng/releases . Still didn't work. I added it to the PATH variable and also created two system variables myself: "PHONEMIZER_ESPEAK_LIBRARY" with the direct path to libespeak-ng.dll, and "PHONEMIZER_ESPEAK_PATH" with the whole folder, so the paths looked like "C:\Program Files\eSpeak NG\libespeak-ng.dll" and "C:\Program Files\eSpeak NG". Still didn't work! So I just went to `.\venv\Lib\site-packages\phonemizer\backend\espeak\base.py` and added a fix at the beginning, where all the imports are.

Either way, I hope this will help someone at least launch VALL-E on Windows in the web UI. And I wish mrq luck in generating a good base model! I'm still not sure I could ever train such a complicated model myself, but I hope in the near future to see extra settings and a new base model. I use TTS to copy the voices of game characters, which doesn't work on base models except TorToiSe fine-tuned for specific voices. I haven't read the paper in detail, but I remember VALL-E takes only about 3 seconds of a sample, which is very little imo. I also hope to see a UI setting to pick between AR, NAR, and both, even if I don't know how that works yet.
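On the espeak step: phonemizer does read the `PHONEMIZER_ESPEAK_LIBRARY` and `PHONEMIZER_ESPEAK_PATH` environment variables, but variables added in the Windows system dialog only reach terminals opened afterwards. An alternative to patching `base.py` inside site-packages is setting them in-process, before any phonemizer backend is created; a sketch using the paths from the post above (adjust to your install):

```python
import os

# Point phonemizer at the eSpeak NG install from inside the process.
# This must run before phonemizer constructs its espeak backend.
os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = r"C:\Program Files\eSpeak NG\libespeak-ng.dll"
os.environ["PHONEMIZER_ESPEAK_PATH"] = r"C:\Program Files\eSpeak NG"

# import phonemizer  # only import/use phonemizer after the variables are set
```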