https://git.ecker.tech/ aims to provide a place to share my efforts while maintaining true ownership of my code, as I do not trust GitHub.
XMR: 4B9TQdkAkBFYrbj5ztvTx89e5LpucPeTSPzemCihdDi9EBnx7btn8RDNZTBz2zihWsjMnDkzn5As1LU6gLv3KQy8BLsZ8SG
- Joined on 2022-10-10
Playing around with EnCodec encoding + Vocos decoding. As good as Vocos is, it still gives some minor audio artifacts for higher-pitched voices. This puts an upper bound on the quality of the…
I cloned the fast fork and edited autoregressive.py to use DeepSpeed. I saw some pretty nice speed-ups, sometimes as much as 7-10 seconds (compared to your fork). But on…
I had gotten that from here, but yeah, I think it's just plain incorrect and probably closer to the number you gave.
Seems like someone ran with someone else's article (and not the paper itself)…
It looks like the original vall-e model used ~140B parameters.
Where'd you get that number from? The papers (VALL-E, VALL-E X, SpeechX) don't mention a parameter count anywhere.
[NaturalSpe…
I think I've gotten a good wrangling of any electrical-related issues through painful trial and error and isolation over the past few days. Turns out there's quite the rabbit hole that I just so…
.\venv\Scripts\activate.bat
pip3 install git+https://github.com/m-bain/whisperX
is the basic way to do it, but you pretty much need to cross your fingers and hope that all the…
I would increase the temperature, as 0.2 is a bit low for TorToiSe. I imagine that's what's happening, since I remember the base model will erase any non-American accents.
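To illustrate why a low temperature flattens out accent variation: temperature divides the logits before softmax, so 0.2 sharpens the distribution hard toward the model's most likely (American-accented) tokens. This is the generic sampling mechanism, not TorToiSe's exact code; the function name and logits are made up for illustration.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng=None):
    """Sample an index after temperature-scaling the logits.

    Lower temperature sharpens the distribution toward the argmax
    (less varied output); higher temperature flattens it.
    Returns (sampled_index, probability_list).
    """
    rng = rng or random.Random(0)
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return idx, probs

# Hypothetical logits: token 0 is the model's "default accent" choice.
logits = [2.0, 1.0, 0.5]
_, cold = sample_with_temperature(logits, 0.2)  # nearly deterministic
_, warm = sample_with_temperature(logits, 1.0)  # more spread out
```

At temperature 0.2 the top token takes ~99% of the mass, while at 1.0 it only takes ~63%, which is why bumping the temperature lets less-likely (accented) tokens through.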
I'm just posting to inform you that vast.ai is a real nugget for GPU cloud, often 3x cheaper than runpod for 3090s/4090s/A40s. The trick is to activate "Unverified Machines".
Ah I see, I didn't…
Shit, I could have sworn I merged this after seeing it a few hours after it was submitted a few days ago. Sorry.
mmm... I think it's foolish to continue running training on the existing weights.
- even before, with the rental 4090s/3090s, the metrics never improved; they just wavered between the ranges…
My gut says:
- the finetune is being trained too fast, as your initial LR is too high / your LR is not decaying fast enough.
- the finetune is also not being trained long enough. 2300 steps /…
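The "LR too high / decaying too slowly" point above can be made concrete with per-step exponential decay. The specific learning rates and the gamma values here are hypothetical numbers for illustration, not the fine-tune's actual settings:

```python
def exp_decay_lr(initial_lr, gamma, step):
    """Learning rate after `step` steps of per-step exponential decay."""
    return initial_lr * (gamma ** step)

# Too high a start with barely any decay: after 2300 steps the LR has
# hardly moved, so every step keeps shoving the weights around.
slow = exp_decay_lr(1e-4, 0.99999, 2300)  # still ~9.8e-5

# A lower start with faster decay lands roughly two orders of magnitude
# lower by the same step count.
fast = exp_decay_lr(1e-5, 0.999, 2300)    # ~1e-6
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=...)` stepped once per iteration; the same argument applies to cosine or linear-warmup schedules.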
In the training YAML, copy over what's in dataset.training into dataset.validation. I could have sworn I had it fall back and do this itself for the validation dataset/dataloader, but…
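The fallback described above could be sketched like this once the YAML is loaded as a dict. The key names follow the comment (`dataset.training`, `dataset.validation`); the surrounding config structure and function name are assumptions:

```python
import copy

def ensure_validation(config):
    """If dataset.validation is missing or empty, reuse dataset.training."""
    dataset = config.setdefault("dataset", {})
    if not dataset.get("validation"):
        # Deep-copy so later edits to one split don't mutate the other.
        dataset["validation"] = copy.deepcopy(dataset.get("training", {}))
    return config

cfg = {"dataset": {"training": {"path": "./train.txt", "batch_size": 16}}}
ensure_validation(cfg)
```

Doing this at config-load time means the validation dataloader always has something to chew on, at the cost of validation loss being measured on the training split.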
The issue I've run into with naively using Japanese is that the default tokenizer normalizes Japanese text the wrong way (it will convert kana/kanji incorrectly). I honestly…