• https://git.ecker.tech/ aims to provide a place to share my efforts while maintaining true ownership of my code, as I do not trust GitHub.

    XMR: 4B9TQdkAkBFYrbj5ztvTx89e5LpucPeTSPzemCihdDi9EBnx7btn8RDNZTBz2zihWsjMnDkzn5As1LU6gLv3KQy8BLsZ8SG

  • Joined on 2022-10-10
mrq commented on issue mrq/ai-voice-cloning#357 2023-08-29 20:22:54 +00:00
Finished training, but it hasn't

In the Training > Prepare Dataset tab, you can click "(Re)Create Datasets" to force it to regenerate the train.txt file, although it's odd that it seemed to have done the prior steps…

mrq commented on issue mrq/ai-voice-cloning#353 2023-08-29 12:28:28 +00:00
Resume training with expanded dataset

into an issue where the model stops training after only 10 minutes

Oh, I suppose this is a bit of a mismatch between what is considered an "epoch" between my side creating the YAML, and the…

mrq commented on issue mrq/ai-voice-cloning#356 2023-08-28 19:52:37 +00:00
Loss gets stuck after resume training

This is more of a cosmetic issue than a functional one.

The printout on the left shows your loss at 0.563, which corresponds to the bottom of the green line on the right. Your LR is 3.239e-06,…

mrq commented on issue mrq/ai-voice-cloning#352 2023-08-28 16:26:07 +00:00
Best ways to get rid of static?

Given the graph, loss curve, and LR curve, I think your LR scheduling might have been too lax and ended up frying the finetune because the LR decayed very slowly. The default scheduling should be…

mrq commented on issue mrq/ai-voice-cloning#353 2023-08-28 16:22:29 +00:00
Resume training with expanded dataset

The training script should be able to resume training from the last checkpoint without needing to update anything else, even if you modified the dataset.

The "Resume Training" or whatever it…

mrq commented on issue mrq/ai-voice-cloning#354 2023-08-28 16:17:58 +00:00
Illegal Instruction after setting quality to anything above ultra-fast?

mmm...

Verify what your batch size is set to in settings. If it's something higher than 16, then the error message might just be misleading, and the true issue is that you're just running out…

mrq commented on issue mrq/ai-voice-cloning#355 2023-08-28 16:11:01 +00:00
RAM memory leak

Ah yeah. I'm not too sure where the issue lies, as even using the repo to transcribe large datasets will eventually hang after a while, presumably from a memory leak somewhere. I'm just not too…

mrq pushed to master at mrq/vall-e 2023-08-28 16:01:51 +00:00
7f4388e591 added total samples processed and tokens processed (len of text tokens + len of target response tokens)
mrq pushed to master at mrq/vall-e 2023-08-27 17:25:11 +00:00
87c4bfedba added ability to mark models as disabled for training, and hotloading them for eval/validation (useful if training only one model, or training a model per GPU)
mrq commented on issue mrq/ai-voice-cloning#152 2023-08-27 03:16:07 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

I think I've got everything I wanted to do done before the next training session, so I can just leave the GPUs (yes, plural) training and shutting up for a while (or at least not overworking…

mrq pushed to master at mrq/vall-e 2023-08-27 02:59:38 +00:00
165a1154e0 Undo naive=False test flag, this shouldn't have made its way in
mrq commented on issue mrq/vall-e#6 2023-08-27 02:14:36 +00:00
Training error: RuntimeError: Could not infer dtype of NoneType

You'll either need to:

mrq pushed to master at mrq/vall-e 2023-08-27 00:52:23 +00:00
78378ed1ce overhauled dataloading code to be marginally faster, mostly cleaned up, and can leverage a metadata json to help things out
mrq pushed to master at mrq/vall-e 2023-08-26 15:20:40 +00:00
7b3be3d7bf added helper scripts to process LibriTTS/LibriLight, detect duplicate speaker+books between them, and script to directly phonemize and quantize LibriTTS
mrq commented on issue mrq/ai-voice-cloning#349 2023-08-26 14:58:40 +00:00
unload_tts() doesnt unload the voice model from video memory
def unload_tts():
	global tts

	if tts:
		del tts
		tts = None
		print("Unloaded TTS")
	do_gc()

I'm pretty sure there's a magical Python issue where this isn't actually working…
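
One possible explanation, offered as an assumption and not as the diagnosis from the thread: clearing the global only drops one reference, so the model stays alive if anything else still points at it. The sketch below shows that behaviour in plain CPython. `DummyModel`, `load_tts`, and `cache` are hypothetical stand-ins, and `gc.collect()` is used in place of `do_gc()`, whose body is not shown here.

```python
import gc
import weakref


class DummyModel:
    """Hypothetical stand-in for a loaded TTS model."""


tts = None


def load_tts():
    # Hypothetical loader: just binds a fresh object to the global.
    global tts
    tts = DummyModel()


def unload_tts():
    global tts
    if tts is not None:
        tts = None  # drops only the global's reference
        print("Unloaded TTS")
    gc.collect()  # stand-in for do_gc()


# Case 1: the global is the only reference, so the object is freed.
load_tts()
probe = weakref.ref(tts)
unload_tts()
freed_without_extra_ref = probe() is None

# Case 2: something else (here a dict) still holds the model.
load_tts()
probe = weakref.ref(tts)
cache = {"model": tts}
unload_tts()
freed_with_extra_ref = probe() is None

# Dropping the last remaining reference finally releases it.
cache.clear()
gc.collect()
freed_after_clearing = probe() is None
```

If this is what is happening, the thing to look for would be other holders of the model (caches, closures, UI state) rather than `unload_tts()` itself. On a GPU, memory held by a framework's caching allocator may also need to be released separately once the object is gone.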

mrq commented on issue mrq/ai-voice-cloning#152 2023-08-26 02:03:46 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

Additionally, while trying to make recurrent_forward work, I managed to finally fix the issue with inferencing. It seems that chunkwise_recurrent does in fact work, and it was actually…

mrq pushed to master at mrq/vall-e 2023-08-26 00:49:07 +00:00
16e0020901 disabled chunkwise_recurrent for 2x speed gains (I suppose it has been working the entire time, but I have not been properly grabbing things, and this might explain why the output is bad)
mrq pushed to master at mrq/ai-voice-cloning 2023-08-26 00:02:24 +00:00
690947ad36 Do not double phonemize if using VALL-E backend (I wonder how many hours I've wasted from this oversight)
mrq commented on issue mrq/ai-voice-cloning#348 2023-08-25 18:27:29 +00:00
Is it possible to train faster tortoise tts with this?

Any model (autoregressive.pth) trained through the web UI (that is, trained with DLAS) is compatible with the base TorToiSe and the forks. I haven't checked base TorToiSe in a while to see if it…

mrq commented on issue mrq/ai-voice-cloning#152 2023-08-25 18:15:36 +00:00
VALL-E Integration (and In Response To TorToiSe: a Quick Retrospective)

Idle hands are truly the devil's workshop.

I'm getting tempted to make another poor purchase decision. My gut wants to go with a 7900XTX despite:

  • already knowing there's an inherent design…