tigi6346
  • Joined on Mar 11, 2023

tigi6346 commented on issue mrq/ai-voice-cloning#113

Generated voices from training data always garbled.... but works fine using tortoise-tts-fast ... (?)

I can't imagine 0-second files being anything other than poorly cut off. If you have enough data, I'd drop the worst of it.

2023-03-11 04:23:23 +07:00

tigi6346 commented on pull request mrq/ai-voice-cloning#112

master

> You might have botched your venv somehow with it having Windows line endings (CRLF). I sure did.

2023-03-11 03:53:12 +07:00

tigi6346 commented on issue mrq/ai-voice-cloning#113

Generated voices from training data always garbled.... but works fine using tortoise-tts-fast ... (?)

I see https://github.com/m-bain/whisperX/blob/main/whisperx/transcribe.py has no_speech_threshold: Optional[float] = 0.6, with the docstring "no_speech_threshold: float. If the no_speech probability is higher…

2023-03-11 03:48:07 +07:00
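
For context on the no_speech_threshold parameter quoted above, here is a minimal sketch of how such a threshold is typically applied. It follows my understanding of the upstream openai/whisper behaviour, where a segment is treated as silent only if its no-speech probability is above the threshold and its average log probability is not above a second threshold. The function name should_skip_segment and the logprob_threshold default of -1.0 are assumptions for illustration; whisperX's own code may differ.

```python
from typing import Optional


def should_skip_segment(
    no_speech_prob: float,
    avg_logprob: float,
    no_speech_threshold: Optional[float] = 0.6,
    logprob_threshold: Optional[float] = -1.0,  # assumed default, not from the comment
) -> bool:
    """Decide whether a decoded segment should be treated as silence.

    Hypothetical helper: mirrors the threshold rule as understood from
    upstream Whisper, not an actual whisperX function.
    """
    if no_speech_threshold is None:
        # Silence detection disabled: never skip.
        return False
    skip = no_speech_prob > no_speech_threshold
    # A confident transcription overrides the no-speech signal.
    if logprob_threshold is not None and avg_logprob > logprob_threshold:
        skip = False
    return skip
```

Under this rule, a segment with a high no-speech probability is still kept when the decoder was confident about the text, which is one reason a chunk that is mostly silence can survive into a dataset.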

tigi6346 commented on issue mrq/ai-voice-cloning#113

Generated voices from training data always garbled.... but works fine using tortoise-tts-fast ... (?)

I don't know if it's good or not, just brainstorming. This will be awful for long files. Although theoretically, if whisper can play nice with what it is given, then another tool can do the cutting.

2023-03-11 03:33:58 +07:00

tigi6346 commented on issue mrq/ai-voice-cloning#113

Generated voices from training data always garbled.... but works fine using tortoise-tts-fast ... (?)

Can we make the cuts for whisper? Say I take my 1-minute source and cut it into sentences. Can whisper then take each sentence and not cut it further? That's what comes to mind when thinking how…

2023-03-11 03:30:15 +07:00
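
The idea in the comment above, cutting the source before whisper sees it, can be sketched with a simple silence-based splitter. This is only an illustration under stated assumptions: samples are floats in [-1, 1], a pause is any run of low-amplitude samples, and the function name split_on_silence and all thresholds are hypothetical, not part of ai-voice-cloning or whisperX. Pauses do not always coincide with sentence boundaries, so the output would still need checking by ear.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[float],
    sample_rate: int,
    silence_thresh: float = 0.01,   # amplitude below this counts as silence
    min_silence_s: float = 0.3,     # pause length that ends a segment
    min_segment_s: float = 0.5,     # shorter segments are dropped
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of non-silent segments.

    `end` is exclusive, so samples[start:end] is the segment.
    """
    min_silence = max(1, round(min_silence_s * sample_rate))
    min_segment = max(1, round(min_segment_s * sample_rate))
    segments: List[Tuple[int, int]] = []
    seg_start = None   # index where the current segment began
    silent_run = 0     # consecutive silent samples inside a segment

    for i, s in enumerate(samples):
        if abs(s) >= silence_thresh:
            if seg_start is None:
                seg_start = i
            silent_run = 0
        elif seg_start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # Close the segment at the first silent sample of the pause.
                seg_end = i - silent_run + 1
                if seg_end - seg_start >= min_segment:
                    segments.append((seg_start, seg_end))
                seg_start = None
                silent_run = 0

    # Flush a segment that runs to the end of the audio.
    if seg_start is not None:
        seg_end = len(samples) - silent_run
        if seg_end - seg_start >= min_segment:
            segments.append((seg_start, seg_end))
    return segments
```

Each returned pair could then be written out as its own clip and transcribed separately. Dropping segments shorter than min_segment_s is one way to avoid the near-empty files mentioned elsewhere in this thread.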

tigi6346 reopened pull request mrq/ai-voice-cloning#112

master

2023-03-11 03:23:48 +07:00

tigi6346 closed pull request mrq/ai-voice-cloning#112

master

2023-03-11 03:17:06 +07:00

tigi6346 commented on issue mrq/ai-voice-cloning#113

Generated voices from training data always garbled.... but works fine using tortoise-tts-fast ... (?)

The thing is, I have very little to complain about in the whisperx transcription process. It is the splitting of the audio into chunks that is sus. I've seen on https://github.com/m-bain/whisperX that they…

2023-03-11 03:09:38 +07:00

tigi6346 commented on issue mrq/ai-voice-cloning#113

Generated voices from training data always garbled.... but works fine using tortoise-tts-fast ... (?)

If you're OOMing (running out of memory) while generating, try lowering your Sample Batch Size in settings. It's possible your fancy magic super duper accuracy boost is correct, but it is the whisper transcribe and…

2023-03-11 03:05:45 +07:00

tigi6346 commented on issue mrq/ai-voice-cloning#113

Generated voices from training data always garbled.... but works fine using tortoise-tts-fast ... (?)

Regarding latents and "Leveraging LJSpeech dataset for computing latents," does that mean that the latents are sensitive to the quality of the generated dataset by whisper/whisperx/whispercpp? So…

2023-03-11 02:22:00 +07:00

tigi6346 reopened pull request mrq/ai-voice-cloning#112

master

2023-03-11 02:17:17 +07:00

tigi6346 closed pull request mrq/ai-voice-cloning#112

master

2023-03-11 02:14:36 +07:00

tigi6346 created pull request mrq/ai-voice-cloning#112

master

2023-03-11 01:38:53 +07:00

tigi6346 pushed to master at tigi6346/ai-voice-cloning

2023-03-11 01:37:31 +07:00

tigi6346 pushed to master at tigi6346/ai-voice-cloning

2023-03-11 01:37:13 +07:00

tigi6346 created repository tigi6346/ai-voice-cloning

2023-03-11 01:36:45 +07:00