"System error" when using Bark as TTS #306
Reference: mrq/ai-voice-cloning#306
When trying to run the Bark TTS mode, I always get this error, no matter what audio files I use:
`D:\voice\ai-voice-cloning>call .\venv\Scripts\activate.bat
Whisper detected
Error: No module named 'vall_e'
Bark detected
Vocos detected
Error: No module named 'hubert'
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
Loading Bark...
Loaded TTS, ready for generation.
[1/1] Generating line: Test
Using as reference: ./training/tboiNarrator/audio/see 4ever 1_00000.wav I can see forever.
Traceback (most recent call last):
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\blocks.py", line 1075, in process_api
result = await self.call_function(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
response = fn(*args)
File "D:\voice\ai-voice-cloning\src\webui.py", line 94, in generate_proxy
raise e
File "D:\voice\ai-voice-cloning\src\webui.py", line 88, in generate_proxy
sample, outputs, stats = generate(**kwargs)
File "D:\voice\ai-voice-cloning\src\utils.py", line 323, in generate
return generate_bark(**kwargs)
File "D:\voice\ai-voice-cloning\src\utils.py", line 465, in generate_bark
gen = tts.inference(cut_text, **settings )
File "D:\voice\ai-voice-cloning\src\utils.py", line 263, in inference
self.create_voice( voice )
File "D:\voice\ai-voice-cloning\src\utils.py", line 212, in create_voice
wav, sr = torchaudio.load(audio_filepath)
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\torchaudio\backend\soundfile_backend.py", line 221, in load
with soundfile.SoundFile(filepath, "r") as file_:
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\soundfile.py", line 658, in __init__
self._file = self._open(file, mode_int, closefd)
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\soundfile.py", line 1216, in _open
raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening './training/tboiNarrator/audio/see 4ever 1_00000.wav': System error.`
I'll go out on a limb and say I forgot to document that you need to first transcribe your voice files under `Training > Prepare Dataset`, since Bark requires a transcription of the reference audio it's being fed. Make sure it also keeps the audio it copies under `./training/tboiNarrator/audio/`. You may or may not need to do this while having `--tts-backend="tortoise"`, but I don't think there's any backend-exclusive code for it.

However, I'm not sure you'll have much luck with voice cloning under Bark. While using the `random` option seems to work fine recently, I still couldn't figure out how to get it to clone semi-competently.

That error means it couldn't find the file (or it doesn't have permission to open it). Examine the filenames in your `train.txt` and check to make sure the files exist in the locations specified.

They are transcribed the way you described and are in the right folders, as far as I'm aware.
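For anyone wanting to script the check suggested above (confirming that every file named in `train.txt` exists), here is a minimal sketch. It assumes each line of `train.txt` has the form `relative/path.wav|transcription` and that the paths are relative to the folder holding `train.txt`. Neither detail is confirmed in this thread, and `find_missing_audio` is a hypothetical helper, not part of the repo.

```python
from pathlib import Path


def find_missing_audio(train_txt, base_dir=None):
    """Return the audio paths listed in train_txt that are not files on disk.

    Assumption: each non-empty line is "relative/path.wav|transcription",
    with paths relative to base_dir (default: the folder of train_txt).
    """
    train_txt = Path(train_txt)
    base = Path(base_dir) if base_dir is not None else train_txt.parent
    missing = []
    for line in train_txt.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Everything before the first "|" is treated as the audio path.
        audio_rel = line.split("|", 1)[0].strip()
        audio_path = base / audio_rel
        if not audio_path.is_file():
            missing.append(audio_path)
    return missing


# Example call (path taken from the log above, adjust to your setup):
# for path in find_missing_audio("./training/tboiNarrator/train.txt"):
#     print("missing:", path)
```

An empty result only means the listed files exist. It would not have caught the problem in this thread if the routine was looking for sliced segment files that `train.txt` does not list.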
Ah, I see now. Oops.

It's rather silly to have to do, but you'll need to go into `Training > Prepare Dataset` and click `(Re)Slice Audio`. If I remember how I implemented it, the routine will read the `whisper.json` (rather than the `train.txt`) to find transcriptions and, since it checks each segment rather than each whole file, it expects the sliced audio to exist (even if it doesn't need to be sliced).

Whenever I get a chance, I'll rework it to instead check `train.txt` first and leverage it, since that should always work.

I lied, I didn't even need to do that, as I was extremely naively assuming things were sliced.

Fixed in commit e2a6dc1c0a. You shouldn't need to slice your audio beforehand.

Ah, that did the trick. Thank you.
But now I get a new error that says "
[1/1] Generating line: Hello
Using as reference: ./training/tboiNarrator/audio/speed down 2_00000.wav Speed down.
Traceback (most recent call last):
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\blocks.py", line 1075, in process_api
result = await self.call_function(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
response = fn(*args)
File "D:\voice\ai-voice-cloning\src\webui.py", line 94, in generate_proxy
raise e
File "D:\voice\ai-voice-cloning\src\webui.py", line 88, in generate_proxy
sample, outputs, stats = generate(**kwargs)
File "D:\voice\ai-voice-cloning\src\utils.py", line 323, in generate
return generate_bark(**kwargs)
File "D:\voice\ai-voice-cloning\src\utils.py", line 465, in generate_bark
gen = tts.inference(cut_text, **settings )
File "D:\voice\ai-voice-cloning\src\utils.py", line 268, in inference
semantic_tokens = text_to_semantic(text, history_prompt=voice, temp=text_temp, silent=False)
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\bark\api.py", line 25, in text_to_semantic
x_semantic = generate_text_semantic(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\bark\generation.py", line 394, in generate_text_semantic
history_prompt = _load_history_prompt(history_prompt)
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\bark\generation.py", line 364, in _load_history_prompt
history_prompt = np.load(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\numpy\lib\npyio.py", line 405, in load
fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'D:\voice\ai-voice-cloning\venv\lib\site-packages\bark\assets\prompts\tboiNarrator.npz'"
And using a random voice:
"[1/1] Generating line: Hello
Generating line took 1.8284106254577637 seconds
Traceback (most recent call last):
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\blocks.py", line 1075, in process_api
result = await self.call_function(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "D:\voice\ai-voice-cloning\venv\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
response = fn(*args)
File "D:\voice\ai-voice-cloning\src\webui.py", line 94, in generate_proxy
raise e
File "D:\voice\ai-voice-cloning\src\webui.py", line 88, in generate_proxy
sample, outputs, stats = generate(**kwargs)
File "D:\voice\ai-voice-cloning\src\utils.py", line 323, in generate
return generate_bark(**kwargs)
File "D:\voice\ai-voice-cloning\src\utils.py", line 528, in generate_bark
audio_cache[name]['output'] = True
KeyError: '00002_1'
"
The first error's due to some jank in requiring Bark to be installed under `./modules/bark/`. I fixed this in commit ac645e0a20.

As for the second one, I'm not too sure, but I also added a possible fix for it. I'm not sure what's causing it, since I haven't hit that issue once in testing. I'll see if I can trigger it when I get a chance.
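If the `FileNotFoundError` for the `.npz` prompt shows up again, a small diagnostic like the sketch below might help confirm which Bark install is being imported and whether the voice prompt is where that install looks for it. The `assets/prompts/<voice>.npz` layout is taken from the traceback above. `expected_prompt_path` is a hypothetical helper, not something from the repo or from Bark.

```python
import importlib.util
from pathlib import Path


def expected_prompt_path(voice, package="bark"):
    """Return <package dir>/assets/prompts/<voice>.npz for the importable
    package, or None if the package cannot be found (or is not a package).

    Assumption: the layout mirrors the path shown in the traceback.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = Path(list(spec.submodule_search_locations)[0])
    return package_dir / "assets" / "prompts" / f"{voice}.npz"


# Example call (voice name taken from the log above):
# path = expected_prompt_path("tboiNarrator")
# print(path, "exists" if path is not None and path.is_file() else "missing")
```

If the printed path points into `site-packages` rather than `./modules/bark/`, that would match the situation described for the first error.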