Commit Graph

237 Commits

Author SHA1 Message Date
mrq
ddbacde0d1 DAC just doesn't work well enough...... 2024-05-25 11:07:52 -05:00
mrq
e3ef89f5aa 100x better for subtrain/eval to be by group instead 2024-05-19 16:40:14 -05:00
mrq
458b95d196 added option to split between text loss and audio loss (to-do: document this better), because it may or may not be a problem with LLaMA-backed models because my loss hovers around 3.9 / 56% accuracy despite sounding decent at the moment 2024-05-19 11:23:56 -05:00
mrq
74e531d391 ugh 2024-05-18 12:02:56 -05:00
mrq
59ef9461f8 ugh 2024-05-18 10:13:58 -05:00
mrq
4bc7e5a6d1 fix loading without needing an hdf5 dataset already prepped (and some other incidental speedups during dataloader prep) 2024-05-18 07:14:26 -05:00
mrq
d88a5ca183 ugh 2024-05-16 07:25:33 -05:00
mrq
d9aabfa3ae final tweaks, hopefully, again 2024-05-15 23:04:19 -05:00
mrq
8d79f78e0a god I need to replace omegaconf 2024-05-12 14:01:52 -05:00
mrq
5eb5db7f7f just don't use DAC 24Khz, it's bad 2024-05-12 13:41:17 -05:00
mrq
230da8b559 should be the final things to scramble around for, DAC's 24KHz model is unusable for this, but both encodec's 24KHz and DAC's 44KHz work 2024-05-12 13:22:08 -05:00
mrq
2437a86efa ugh 2024-05-12 13:02:15 -05:00
mrq
4f1593c8db a bunch of shit to salvage my old encodec-quantized audio because dac-encoded audio just does not want to converge 2024-05-12 10:17:29 -05:00
mrq
917eeb40d2 ughhh 2024-05-12 08:22:39 -05:00
mrq
9910c75d5a checkpointing for bitnet impl 2024-05-12 07:52:54 -05:00
mrq
14709ac67f ughh 2024-05-12 07:30:59 -05:00
mrq
3774fcbdee ugh 2024-05-11 22:58:38 -05:00
mrq
856545f8bb nan loss detection (should have added it earlier), loss scaling for local backend + fp16 2024-05-11 22:23:29 -05:00
mrq
a755eb3c62 ugh 2024-05-11 17:34:45 -05:00
mrq
88e9b9caff local ddp fix 2024-05-11 17:29:01 -05:00
mrq
3337c69e5a leverage between xformers and torch.backends.cuda.sdp_kernel for attention 2024-05-11 17:14:05 -05:00
mrq
d33c7bb7cf ugh 2024-05-11 16:47:19 -05:00
mrq
0b6499601b sanitizing 2024-05-11 16:31:05 -05:00
mrq
71e373064f remove redundant loss, tweak readme 2024-05-11 15:02:47 -05:00
mrq
04a80d6b55 maybe it's better to be more explicit in deepspeed configs 2024-05-11 13:57:43 -05:00
mrq
4d93a16ef7 might just be better to explicitly define prompt duration ranges, especially under a "train small contexts then increase it" training paradigm 2024-05-11 09:50:54 -05:00
mrq
bd0a36ba8d I swear I keep seeing tqdm flicker back a number 2024-05-10 18:36:01 -05:00
mrq
2109712e5b resolve deprecation warning that doesn't show on my old training rig but does on my new one 2024-05-09 23:25:44 -05:00
mrq
1547de5020 haha... 2024-05-09 23:15:52 -05:00
mrq
b7bd885651 some possible sanity with deepspeed config 2024-05-09 22:48:42 -05:00
mrq
c4b696ebeb oops 2024-05-09 22:33:40 -05:00
mrq
c22a177cf8 forgot to pass warmup to schedule free 2024-05-09 22:18:49 -05:00
mrq
b6131565ad autotune? 2024-05-09 21:25:40 -05:00
mrq
6ed6ab8c03 a bit more cleanup for deepspeed ds_cfg creation 2024-05-09 21:00:26 -05:00
mrq
0d5d545a40 crammed in DAdaptation (doesn't seem worth it) and ScheduleFree (forgot I wanted to weeks ago, seems promising), optimization wrapper cleanup, test trainer changes, etc. 2024-05-09 20:28:20 -05:00
mrq
c6e0f905b5 final tweaks (again) before training restarts 2024-05-08 02:11:38 -05:00
mrq
215800484d correcting my wrong of assuming I could just use raw 24Khz audio in the 44Khz DAC without too much of an issue (there are issues) 2024-05-04 23:49:15 -05:00
mrq
9f738fbd5b seems I actually don't need RVQ bins 9-32 with the 24Khz DAC model........ (time to requantize my audio...) 2024-05-04 23:09:18 -05:00
mrq
33b7f81b94 small cleanups 2024-05-04 22:37:22 -05:00
mrq
8aa1b2dabf documentation update 2024-05-04 21:03:46 -05:00
mrq
253441b750 forgot to disable verbose flag 2024-05-04 13:13:52 -05:00
mrq
3dca1125f5 implemented xformers in HF's Llama (because theres no flash attention for Volta cards) 2024-05-04 13:07:45 -05:00
mrq
277dcec484 apparently I got an error for trying to serialize an errant tensor that made its way into the json, this could be remedied easily with recursively traversing the dict and coercing any objects to primitives, but I'm tired and I just want to start training and nap 2024-05-04 12:33:43 -05:00
mrq
ffa200eec7 added option to specify frames per second for the given audio representation (Encodec is 75Hz, DAC is 41Hz (at 24K sources)) 2024-05-04 12:05:41 -05:00
mrq
c494894261 simple DDP wrapper (for my NVlink test) 2024-05-04 11:48:26 -05:00
mrq
783db3d2c5 forgot to commit the DAC test utterance 2024-05-04 09:46:51 -05:00
mrq
a7b43b98b5 renamed cfg.bitsandbytes to cfg.optimizations (and having it serve as cfg.optimizations.bitsandbytes) 2024-05-02 20:08:59 -05:00
mrq
b5d1456a09 backwards compat for my shitty old weights (was testing if disabling AudioEmbedding summing magically made things better (it did not)) 2024-04-29 22:14:01 -05:00
mrq
5120ffdda7 god it would be nice to know the best way to handle audio embeddings, because I genuinely don't know without skimming through papers or devoting X amount of GPU hours in training 2024-04-29 18:24:05 -05:00
mrq
6a11bc9cb6 update tokenizer because, for some reason, it had the wrong order for the special tokens to where eos = unk 2024-04-29 09:09:26 -05:00