"Unsupported audio format provided: .pth" #466

Open
opened 2024-01-08 21:49:41 +00:00 by Atoli · 1 comment

I am using models trained with this same UI.
Training finished, but when I try to use the result (by copying it into \ai-voice-cloning\voices), I get the following error:

  File "C:\Users\PC\Desktop\ai-voice-cloning\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "C:\Users\PC\Desktop\ai-voice-cloning\venv\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
    response = fn(*args)
  File "C:\Users\PC\Desktop\ai-voice-cloning\src\webui.py", line 94, in generate_proxy
    raise e
  File "C:\Users\PC\Desktop\ai-voice-cloning\src\webui.py", line 88, in generate_proxy
    sample, outputs, stats = generate(**kwargs)
  File "C:\Users\PC\Desktop\ai-voice-cloning\src\utils.py", line 363, in generate
    return generate_tortoise(**kwargs)
  File "C:\Users\PC\Desktop\ai-voice-cloning\src\utils.py", line 1222, in generate_tortoise
    settings = get_settings( override=override )
  File "C:\Users\PC\Desktop\ai-voice-cloning\src\utils.py", line 1084, in get_settings
    settings['voice_samples'], settings['conditioning_latents'], _ = fetch_voice(voice=selected_voice)
  File "C:\Users\PC\Desktop\ai-voice-cloning\src\utils.py", line 1017, in fetch_voice
    voice_samples, conditioning_latents = load_voice(voice, model_hash=tts.autoregressive_model_hash)
  File "C:\Users\PC\Desktop\ai-voice-cloning\modules\tortoise-tts\tortoise\utils\audio.py", line 189, in load_voice
    c = load_audio(path, sample_rate)
  File "C:\Users\PC\Desktop\ai-voice-cloning\modules\tortoise-tts\tortoise\utils\audio.py", line 32, in load_audio
    assert False, f"Unsupported audio format provided: {audiopath[-4:]}"
AssertionError: Unsupported audio format provided: .pth

What could the problem be?


@Atoli
Your .pth files should stay in the "models" folder.

For instance: \ai-voice-cloning\training\white_female3a\finetune\models
The voices folder is where it looks for .wav files.

To use the fine-tuned .pth file, open the web UI, go to Settings, and select the .pth file under "Auto-regressive Model".

Mine says something like: "./training/white_female2b/finetune/models/241_gpt.pth"

That's how you use trained voices.
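
If you are not sure whether a checkpoint ended up in the voices folder, a quick scan can tell you. This is only a rough sketch: `find_pth_files` is a hypothetical helper, not part of the UI, and the folder path is taken from the traceback above, so adjust it to your install. It lists every .pth file under a folder so you can review them by hand; it does not decide which ones belong there.

```python
from pathlib import Path


def find_pth_files(folder):
    """Return every .pth file found under `folder`, sorted by path.

    Hypothetical helper for illustration; not part of ai-voice-cloning.
    """
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pth"
    )


if __name__ == "__main__":
    # Assumed location, copied from the traceback in this issue.
    voices_dir = r"C:\Users\PC\Desktop\ai-voice-cloning\voices"
    for path in find_pth_files(voices_dir):
        print(path)
```

Any fine-tuned model file it prints (something like 241_gpt.pth) should go back to its training "models" folder and be picked in Settings instead.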

FYI - I'm convinced that mrq was picked up by some AI technology company to assist with their AI voice cloning product.... or prevent him from maturing this. Good for him!

Reference: mrq/ai-voice-cloning#466