9e3f2e300f  experimental "just have a token for what RVQ level we're on" that seems to help all models (mamba almost works, but it might just have to be relegated as a pure AR model) (mrq, 2024-06-04 23:23:31 -0500)
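A minimal sketch of the idea in 9e3f2e300f, assuming a learned per-level token prepended to the sequence; the class and dimension names here are illustrative, not the repo's actual implementation:

```python
import torch
import torch.nn as nn

NUM_RVQ_LEVELS = 8   # assumption; EnCodec-style codebook count
D_MODEL = 1024       # assumption

class RVQLevelToken(nn.Module):
    """Prepend a learned token telling the model which RVQ level it targets."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(NUM_RVQ_LEVELS, D_MODEL)

    def forward(self, x: torch.Tensor, level: int) -> torch.Tensor:
        # x: (batch, seq, d_model); prepend the level token along the sequence axis
        tok = self.emb(torch.tensor([level], device=x.device))  # (1, d_model)
        tok = tok.expand(x.shape[0], 1, D_MODEL)                # (batch, 1, d_model)
        return torch.cat([tok, x], dim=1)
```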
e0886c5a78  re-added mamba as a possible non-experimental arch backend (test trainer will set it as AR only, doing any NAR tasks lobotomizes it) (mrq, 2024-06-04 22:41:22 -0500)
687c71e028  disable accuracy calc because it breaks with actual batched training even though it shouldn't (mrq, 2024-06-04 22:13:44 -0500)
ed3aeaf3a1  copy pasted from test to actual trainer (mrq, 2024-06-04 18:40:30 -0500)
0aa01ba31a  forgot one crucial detail (you *need* the previous RVQ level to keep coherence between all RVQ levels) (experimental deinterleaved is a bit crusty though) (mrq, 2024-06-04 18:30:30 -0500)
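The coherence point in 0aa01ba31a, sketched: when predicting codes at level n, the input embedding sums the embeddings of the codes already fixed at levels 0..n-1, so each NAR level refines the coarser ones beneath it. All names here are illustrative, and the repo's actual conditioning may differ:

```python
import torch
import torch.nn as nn

class SummedAudioEmbedding(nn.Module):
    def __init__(self, n_levels: int, n_codes: int, d_model: int):
        super().__init__()
        # one embedding table per RVQ level
        self.embs = nn.ModuleList(nn.Embedding(n_codes, d_model) for _ in range(n_levels))

    def forward(self, codes: torch.Tensor, target_level: int) -> torch.Tensor:
        # codes: (batch, seq, n_levels) int64; only levels *below* the target contribute
        bsz, seq, _ = codes.shape
        x = codes.new_zeros(bsz, seq, self.embs[0].embedding_dim, dtype=torch.float)
        for lvl in range(target_level):
            x = x + self.embs[lvl](codes[..., lvl])
        return x
```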
7feeb944a0  probably insane for even entertaining going this route (mrq, 2024-06-03 20:26:27 -0500)
c2a436d368  somehow between training sessions grad_norm = None even though it worked before (mrq, 2024-06-02 08:29:27 -0500)
c1fcd889d5  reverted automatically disabling split loss calc, since it seems that calculating loss on the prom is what actually causes the oddities, maybe (mrq, 2024-06-01 12:34:59 -0500)
39bc019142  actually save per-rank sampler states (mrq, 2024-06-01 09:46:32 -0500)
74df2f5332  split sampler dict by global_rank, also handle splitting dataset paths by global_rank if sampler_type == path (because I do not trust DistributedSampler) (need to test) (mrq, 2024-06-01 09:29:49 -0500)
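What 74df2f5332's path splitting might look like when done by hand instead of via DistributedSampler; a hedged sketch, with the repo's actual logic possibly differing:

```python
import torch.distributed as dist

def shard_paths_by_rank(paths: list[str]) -> list[str]:
    """Give each rank a disjoint, strided slice of the dataset paths."""
    if not dist.is_initialized():
        return paths
    rank, world_size = dist.get_rank(), dist.get_world_size()
    # sort first so every rank agrees on ordering, then stride so slices never overlap
    return sorted(paths)[rank::world_size]
```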
31785f4eeb  actually don't default to computing split losses; the test bitnet model doesn't seem to be doing things right (despite debug printouts showing they're roughly the same logit/loss sequences, it could just be bitnet linears not being up to par on actual models) (mrq, 2024-06-01 09:12:51 -0500)
b482ca19ff  added model config option to set KV head count for MQA/GQA instead of MHA for llama-based models (I think it's very negligible both ways at such a small model size) (mrq, 2024-05-31 19:32:37 -0500)
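For llama-based models this maps onto a real HF knob, LlamaConfig.num_key_value_heads: equal to num_attention_heads gives plain MHA, a divisor gives GQA, and 1 gives MQA. The head counts below are examples only:

```python
from transformers import LlamaConfig

mha = LlamaConfig(num_attention_heads=16, num_key_value_heads=16)  # vanilla multi-head
gqa = LlamaConfig(num_attention_heads=16, num_key_value_heads=4)   # grouped-query (4 KV groups)
mqa = LlamaConfig(num_attention_heads=16, num_key_value_heads=1)   # multi-query (shared KV)
```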
da473295b7  better way to compute per-segment losses (mrq, 2024-05-28 19:29:54 -0500)
6c49ad06a3  forgot to reinclude mult by loss factors (mrq, 2024-05-27 20:40:21 -0500)
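A sketch of per-segment losses with loss factors, as da473295b7 / 6c49ad06a3 gesture at: cross-entropy computed separately over the text span and the audio span of the flattened sequence, each scaled by its own factor. Names and factor values are hypothetical:

```python
import torch
import torch.nn.functional as F

def segmented_loss(logits, targets, text_mask, audio_mask,
                   text_factor: float = 0.1, audio_factor: float = 1.0):
    # logits: (seq, vocab); targets: (seq,); masks: (seq,) bool, assumed non-empty
    text_loss  = F.cross_entropy(logits[text_mask],  targets[text_mask])
    audio_loss = F.cross_entropy(logits[audio_mask], targets[audio_mask])
    return text_factor * text_loss + audio_factor * audio_loss
```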
b82f0d5c0c  finally nailed the issue that caused logging to break on one machine but not another (bitnet includes zetascale, which is a parasite that will break logging) (mrq, 2024-05-27 19:47:58 -0500)
d760924719  added a kludgy eval-only mode so I don't have to start training, type eval, stop training, then delete the logs for that session (mrq, 2024-05-25 17:39:51 -0500)
ddbacde0d1  DAC just doesn't work well enough... (mrq, 2024-05-25 11:07:52 -0500)
e3ef89f5aa  100x better for subtrain/eval to be by group instead (mrq, 2024-05-19 16:40:14 -0500)
458b95d196  added option to split between text loss and audio loss (to-do: document this better); it may or may not be a problem with LLaMA-backed models, since my loss hovers around 3.9 / 56% accuracy despite sounding decent at the moment (mrq, 2024-05-19 11:23:56 -0500)
4bc7e5a6d1  fix loading without needing an hdf5 dataset already prepped (and some other incidental speedups during dataloader prep) (mrq, 2024-05-18 07:14:26 -0500)
8d79f78e0a  god I need to replace omegaconf (mrq, 2024-05-12 14:01:52 -0500)
5eb5db7f7f  just don't use DAC 24kHz, it's bad (mrq, 2024-05-12 13:41:17 -0500)
230da8b559  should be the final things to scramble around for; DAC's 24kHz model is unusable for this, but both EnCodec's 24kHz and DAC's 44kHz work (mrq, 2024-05-12 13:22:08 -0500)
4f1593c8db  a bunch of shit to salvage my old encodec-quantized audio because dac-encoded audio just does not want to converge (mrq, 2024-05-12 10:17:29 -0500)
04a80d6b55  maybe it's better to be more explicit in deepspeed configs (mrq, 2024-05-11 13:57:43 -0500)
4d93a16ef7  might just be better to explicitly define prompt duration ranges, especially under a "train small contexts then increase it" training paradigm (mrq, 2024-05-11 09:50:54 -0500)
bd0a36ba8d  I swear I keep seeing tqdm flicker back a number (mrq, 2024-05-10 18:36:01 -0500)
2109712e5b  resolve deprecation warning that doesn't show on my old training rig but does on my new one (mrq, 2024-05-09 23:25:44 -0500)
6ed6ab8c03  a bit more cleanup for deepspeed ds_cfg creation (mrq, 2024-05-09 21:00:26 -0500)
0d5d545a40  crammed in DAdaptation (doesn't seem worth it) and ScheduleFree (forgot I wanted to try it weeks ago, seems promising), optimization wrapper cleanup, test trainer changes, etc. (mrq, 2024-05-09 20:28:20 -0500)
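ScheduleFree here is the schedulefree package's optimizer; per its documentation the optimizer itself must be toggled between train and eval modes. A minimal, self-contained usage sketch (the model and learning rate are placeholders):

```python
import torch
import schedulefree

model = torch.nn.Linear(8, 1)                                       # placeholder model
optimizer = schedulefree.AdamWScheduleFree(model.parameters(), lr=2.5e-4)

optimizer.train()                                  # required before training steps
x, y = torch.randn(4, 8), torch.randn(4, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()

optimizer.eval()                                   # required before eval/checkpointing
```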
c6e0f905b5  final tweaks (again) before training restarts (mrq, 2024-05-08 02:11:38 -0500)
215800484d  correcting my wrong assumption that I could just use raw 24kHz audio with the 44kHz DAC without too much of an issue (there are issues) (mrq, 2024-05-04 23:49:15 -0500)
9f738fbd5b  seems I actually don't need RVQ bins 9-32 with the 24kHz DAC model... (time to requantize my audio) (mrq, 2024-05-04 23:09:18 -0500)
253441b750  forgot to disable verbose flag (mrq, 2024-05-04 13:13:52 -0500)
3dca1125f5  implemented xformers in HF's Llama (because there's no flash attention for Volta cards) (mrq, 2024-05-04 13:07:45 -0500)
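The substitution 3dca1125f5 describes, sketched against xformers' public op (the repo's actual monkey-patch of HF's attention will differ): memory_efficient_attention runs on Volta (sm_70), which the flash-attention kernels do not support.

```python
from xformers.ops import memory_efficient_attention

def xformers_attention(q, k, v, attn_bias=None):
    # q, k, v: (batch, seq, heads, head_dim), the layout xformers expects;
    # HF Llama keeps (batch, heads, seq, head_dim), so a real patch transposes first
    return memory_efficient_attention(q, k, v, attn_bias=attn_bias)
```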
277dcec484  apparently I got an error for trying to serialize an errant tensor that made its way into the json; this could be remedied easily by recursively traversing the dict and coercing any objects to primitives, but I'm tired and I just want to start training and nap (mrq, 2024-05-04 12:33:43 -0500)
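The remedy 277dcec484 describes but defers, sketched: recursively walk the dict and coerce anything non-JSON-serializable (like a stray tensor) down to primitives.

```python
import torch

def to_primitives(obj):
    if isinstance(obj, dict):
        return {k: to_primitives(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_primitives(v) for v in obj]
    if isinstance(obj, torch.Tensor):
        return obj.item() if obj.numel() == 1 else obj.tolist()
    if isinstance(obj, (str, int, float, bool)) or obj is None:
        return obj
    return str(obj)  # last resort: stringify anything json.dumps would choke on
```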
ffa200eec7  added option to specify frames per second for the given audio representation (EnCodec is 75Hz, DAC is 41Hz at 24kHz sources) (mrq, 2024-05-04 12:05:41 -0500)
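The arithmetic that option feeds, using the frame rates the message gives; illustrative helpers, not the repo's API:

```python
def duration_to_frames(seconds: float, fps: int) -> int:
    return int(seconds * fps)

def frames_to_duration(frames: int, fps: int) -> float:
    return frames / fps

assert duration_to_frames(2.0, fps=75) == 150  # EnCodec: 75 frames per second
assert duration_to_frames(2.0, fps=41) == 82   # DAC @ 24kHz sources: 41 frames per second
```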
c494894261  simple DDP wrapper (for my NVLink test) (mrq, 2024-05-04 11:48:26 -0500)
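A "simple DDP wrapper" in the stock PyTorch idiom (the repo's wrapper may differ); assumes launch via torchrun, which sets LOCAL_RANK:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def ddp_wrap(model: torch.nn.Module) -> torch.nn.Module:
    dist.init_process_group(backend="nccl")          # NCCL for NVLink'd GPUs
    local_rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
    torch.cuda.set_device(local_rank)
    return DDP(model.to(local_rank), device_ids=[local_rank])
```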
783db3d2c5  forgot to commit the DAC test utterance (mrq, 2024-05-04 09:46:51 -0500)
a7b43b98b5  renamed cfg.bitsandbytes to cfg.optimizations (and having it serve as cfg.optimizations.bitsandbytes) (mrq, 2024-05-02 20:08:59 -0500)
b5d1456a09  backwards compat for my shitty old weights (was testing if disabling AudioEmbedding summing magically made things better (it did not)) (mrq, 2024-04-29 22:14:01 -0500)
5120ffdda7  god it would be nice to know the best way to handle audio embeddings, because I genuinely don't know without skimming through papers or devoting X amount of GPU hours in training (mrq, 2024-04-29 18:24:05 -0500)
6a11bc9cb6  update tokenizer because, for some reason, it had the wrong order for the special tokens to where eos = unk (mrq, 2024-04-29 09:09:26 -0500)
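A sanity check for the class of bug in 6a11bc9cb6 (special tokens ordered so that eos collided with unk), using the HF tokenizers API; the token strings and file path are assumptions:

```python
from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("tokenizer.json")
eos_id = tokenizer.token_to_id("</s>")   # assumed eos token string
unk_id = tokenizer.token_to_id("<unk>")  # assumed unk token string
assert eos_id != unk_id, f"special token collision: eos={eos_id} unk={unk_id}"
```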
57810e4ba4  metadata-only path (might drop HDF5 since it's giving file sizes twice as large as my actual unpacked dataset) (mrq, 2024-04-28 23:03:09 -0500)
ffc334cf58  added dataset transcription helper script (now I don't ever have to touch ai-voice-cloning) (to-do: unify scripts into the module) (mrq, 2024-04-21 17:43:20 -0500)
b251669536  forgot to fix up the test trainer (mrq, 2024-04-21 14:58:04 -0500)
071fb97777  dataset preparation script updates; caved and am using the HF tokenizer now (mrq, 2024-04-21 14:49:18 -0500)
a8ffa88844  it slipped my mind that technically DAC can be used at any sample rate, since it models waveforms; make it a config YAML option to allow this behavior (mrq, 2024-04-19 18:36:54 -0500)
00804a47e9  forgot to copy the intermediary dataset conversion script (mrq, 2024-04-18 21:34:28 -0500)
8214aa23d7  converting over to a different intermediary dataset format (mrq, 2024-04-18 21:24:06 -0500)
4f5c9e518a  actually use the passed-through sample rate from encode for DAC, because it does its own resampling I guess (mrq, 2024-04-18 13:32:41 -0500)
2e9e6e68f7  forgot I need to use DAC's 44kHz model because the 24kHz model has 32 codebooks instead of 9 (mrq, 2024-04-17 20:59:25 -0500)
5ff2b4aab5  finally swallowing the Descript-Audio-Codec pill (I guess I'm going to have to regenerate my entire dataset) (mrq, 2024-04-17 20:39:35 -0500)