Missing dataset: whisper.json. #279
Reference: mrq/ai-voice-cloning#279
I have read the documentation multiple times, even other sections, but I have no idea where whisper.json is supposed to come from.
https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Training
https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Generate
Nothing about creating a whisper.json file.
I did as instructed and put a wav file in a folder under the voices folder. Flonne is the folder that holds the wav file, and it is the voice I use to generate, just like the training section says.
That same audio file is used to train, according to the training section. However, when I press transcribe and process, I get the error shown in the image.
How do I generate this whisper.json file?
Something may have gone wrong with the transcription, please post your console log.
Here is the image, is this what you were asking for?
Extra information to replicate my issue:
The voice, Flonne, was obtained from the link of the mega on the wiki:
https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Collecting-Samples
In the mega folder, the Flonne one.
I converted it to wav with ffmpeg, because that's what the documentation mentioned, and put it in the voices folder, in a folder named "Flonne".
System:
OS: Windows 10.
GPU: RTX 3060 12 GB.
Python: 3.10.
Nvidia drivers: Latest.
It's the second-to-last message that indicates the root of the problem. It can't find the file, so it can't transcribe anything, so the whisper.json never gets made.
Thank you for the answer.
Why would it not find it?
Please take a look at this image: the file is there, as you can see in the folder path. What could the problem be?
Please run ffprobe on the file and post the results.
Sure.
Is this good enough, or do you want me to use another parameter?
I got it by doing this in ffmpeg:
ffmpeg -i Flonne.mp3 Flonne.wav
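For a quick sanity check of the converted file without ffprobe, the header can also be read with Python's standard wave module. This is only a sketch: describe_wav is a hypothetical helper, and the wave module only understands uncompressed PCM WAV, so it is not a full substitute for ffprobe.

```python
import wave


def describe_wav(path):
    """Return basic header fields of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


# Example call (the file name is taken from the command above):
# print(describe_wav("Flonne.wav"))
```

If the call raises wave.Error, the file is probably not plain PCM WAV, which would be worth mentioning alongside the ffprobe output.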
Might I ask, does the slash matter? In the second image I posted, I noticed it said
./voices/Flonne\Flonne.wav
Windows uses \ but the UI uses /.
I wonder if that matters.
Hmm, it's a valid .wav file... it should be able to convert it from there. Can you try running whisperx on it (or just whisper, if that's what you have installed) and see if it throws an error?
I had no luck installing whisperx; I'm not even sure I installed the module correctly.
What I did was install it via
pip install git+https://github.com/m-bain/whisperx.git --upgrade
followed by what they said in the repo:
$ git clone https://github.com/m-bain/whisperX.git
$ cd whisperX
$ pip install -e .
This repo was cloned in the location \ai-voice-cloning\modules.
To install the normal whisper, I did what the GitHub repo said:
pip install git+https://github.com/openai/whisper.git
That worked; in the screenshot you can see it says it loaded whisper.
I also got another voice and added it to the voices folder, then tested the training, but it still errors (as you can see in this screenshot, "Melina" is the newer voice). I can generate, but when I try to train, it suddenly cannot find or recognize the wav file. It's very weird.
I noticed something weird though:
When generating, it searched for
./voices\Melina\cond_latents_d1f79232.pth
When training, it searched for
./voices/Melina\Melina.wav
and not for
./voices\Melina\Melina.wav
I wonder if that matters.
Here is a screenshot of how it looks.
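On the slash question, a small sketch may help. It assumes (this is not confirmed from the UI's source) that the training path is built by joining a forward-slash prefix with a file name via os.path.join, which on Windows inserts a backslash. The ntpath module is the Windows flavour of os.path and PureWindowsPath applies Windows path rules, so the example runs on any OS.

```python
import ntpath  # Windows implementation of os.path, importable on any OS
from pathlib import PureWindowsPath

# Joining a forward-slash prefix with a file name under Windows rules
# produces the mixed-separator spelling seen in the log.
mixed = ntpath.join("./voices/Melina", "Melina.wav")
print(mixed)  # ./voices/Melina\Melina.wav

# Windows path semantics accept both separators, so the mixed spelling and
# the all-backslash spelling compare equal as paths.
same = PureWindowsPath(mixed) == PureWindowsPath(".\\voices\\Melina\\Melina.wav")
print(same)  # True
```

Under these assumptions the mixed slashes are cosmetic and would not by themselves explain a "file not found" error on Windows.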
I also have this issue. Exactly as Atoli describes above. I see the same mix of / and \ slashes. I just started with ai-voice-cloning yesterday. I removed and re-installed - did not see any errors. Everything else seems to work.
Try running whisper from the command line on the .wav file and see if it works.
Ok, I tried
whisper Flonne.wav
in the folder where the Flonne wav file was.
It "generated" something that was 400 MB (I don't know what) and then I got an error.
The screenshot shows all the details.
"[WinError 2] The system cannot find the file specified" is the same error as before, so it looks like the problem is not related to the cloning software but more likely something to do with your python install.
Weird.
Do you have any suggestion on how to possibly fix it? I uninstalled Python 3.10 and installed Python 3.9, but the same error happens.
Humanzoo also said he/she is having the exact same problem as me, so it seems to be a new issue affecting multiple users.
Also, I have a question: the documentation says ffmpeg is needed for training, but it never says where to put ffmpeg after downloading it.
So I have the exe files but nowhere to put them.
I found that the whisper repo says this error might be related to the system not finding ffmpeg:
https://github.com/openai/whisper/discussions/109
Could this be the problem?
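The explanation in that discussion can be reproduced in isolation. When Python tries to start an external program that cannot be found, it raises FileNotFoundError (shown as "[WinError 2]" on Windows) whether or not the input file exists. The sketch below uses a deliberately made-up executable name to stand in for a missing ffmpeg; how whisper actually invokes ffmpeg is not shown here, and launch_fails_with_not_found is a hypothetical helper.

```python
import subprocess


def launch_fails_with_not_found(executable, input_file):
    """Return True if starting `executable` fails because the program itself is missing."""
    try:
        subprocess.run([executable, "-i", input_file], capture_output=True, check=False)
    except FileNotFoundError:
        # On Windows the message reads:
        # "[WinError 2] The system cannot find the file specified"
        return True
    return False


# "no-such-ffmpeg-binary" is a made-up name standing in for an ffmpeg that is not on PATH.
print(launch_fails_with_not_found("no-such-ffmpeg-binary", "Flonne.wav"))
```

The error message does not say which file is missing, which is why it is easy to read it as a complaint about the .wav file.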
I believe you need to install it via pip so it can be used in the python virtual environment. Could be the root cause of your problem.
Greetings.
I found the issue: ffmpeg was not in the PATH, so whisper could not find it. I guess the error was not that it couldn't find Flonne.wav but that it could not find ffmpeg.exe.
humanzoo, the fix is to download ffmpeg and add its folder to the PATH environment variable.
Here is how to do so: https://linuxhint.com/add-directory-to-path-environment-variables-windows/
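To confirm the change took effect, one option is to ask Python whether ffmpeg can be resolved from the current environment. This is a minimal sketch using shutil.which from the standard library; find_tool is a hypothetical helper name.

```python
import shutil


def find_tool(name="ffmpeg"):
    """Return the full path of `name` if it is on PATH, else None (hypothetical helper)."""
    return shutil.which(name)


location = find_tool("ffmpeg")
if location is None:
    print("ffmpeg is not on PATH; programs that call it will fail to start it")
else:
    print(f"ffmpeg found at {location}")
```

Run it from a newly opened terminal, since a process only sees the PATH it was started with.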
That worked! Thank you Atoli. If you are on Windows you must reboot after changing the path. At least I had to.