|
2dfef693c4
|
comments for clarity
|
2025-03-16 11:30:23 -05:00 |
|
|
0a45c9c042
|
fix attention backend not being used
|
2025-02-27 21:38:38 -06:00 |
|
|
eff180248c
|
decoupled llama backend to avoid any funny changes from transformers, removed other backends since i dont think i'll ever bother using them
|
2025-02-27 19:00:37 -06:00 |
|
|
cbd4d7d7f4
|
ugh
|
2025-02-26 21:31:10 -06:00 |
|
|
2ea387c08a
|
segregated experimental changes into its own streamlined file to avoid breaking the existing model, and it can pivot to the cleaned up code if it actually works (nothing is working)
|
2025-02-26 21:26:13 -06:00 |
|
|
95da4e9405
|
made muon actually work by actually utilizing param groups (thanks APOLLO for reminding me this is the sane way to handle this split)
|
2025-02-26 10:39:13 -06:00 |
|
|
de27115bb7
|
there's something wrong with it on my 4xV100 rig......
|
2025-02-25 15:14:08 -06:00 |
|
|
a5a04c39ef
|
when the
|
2025-02-24 21:03:23 -06:00 |
|
|
918e0dbac1
|
small slop cleanup
|
2025-02-24 19:03:53 -06:00 |
|
|
0f39f4d7a1
|
lol
|
2025-02-24 17:51:35 -06:00 |
|
|
33d5a7109a
|
its a miracle i was able to get a semblance of audio with the naive AudioEncoder (now it interleaves properly)
|
2025-02-24 14:39:12 -06:00 |
|
|
8f5a3997bd
|
another experimental flag
|
2025-02-24 13:50:41 -06:00 |
|
|
b640fabab5
|
borrowed muon since it might better work under deepspeed and not require cruft (even though it really does not like the masked-NAR, also make the masked-NAR faux-causal since it might better help out for cfg.model.version >= 7
|
2025-02-23 17:23:24 -06:00 |
|
|
8f3c3e01ee
|
oops
|
2025-02-23 12:09:56 -06:00 |
|
|
b39aaacd77
|
oops
|
2025-02-23 11:55:43 -06:00 |
|
|
3019c88799
|
separate mask token and stop token because this might cause issues
|
2025-02-23 11:36:32 -06:00 |
|
|
6634d07576
|
added muon optimizer through kludge hacks because it necessitates a second optimizer in tandum that seems to only sometimes work with deepspeed
|
2025-02-23 11:22:13 -06:00 |
|
|
ab0abd2b12
|
fixes fixes fixes (a quarter of my recently processed audio returned zero'd tensors......)
|
2025-02-22 09:07:33 -06:00 |
|
|
13c3a08853
|
nevermind thats slow
|
2025-02-14 16:35:17 -06:00 |
|
|
285e493b12
|
ugh..........
|
2025-02-14 16:24:34 -06:00 |
|
|
a65c8144f4
|
with the amount of tweaks I keep making I could have probably had the nvidia/audio-codec-44khz model realized already......
|
2025-02-13 18:38:40 -06:00 |
|
|
e3becec0e8
|
more better-er loss calc I suppose
|
2025-02-13 12:49:53 -06:00 |
|
|
e8f182b634
|
cleaned up loss calc code (it REALLY hates ignore_loss_for_inputs, but is fine with splitting with loss factors)
|
2025-02-13 09:35:27 -06:00 |
|
|
319ca09a4f
|
cleanup
|
2025-02-12 23:36:32 -06:00 |
|
|
b52c5c5d80
|
this seems to work in testing
|
2025-02-12 16:16:04 -06:00 |
|
|
e029a8804d
|
ironically none of this cruft gets the loss lower than the original way
|
2025-02-12 11:17:00 -06:00 |
|
|
4b31f5c808
|
this seems preferable
|
2025-02-12 00:36:50 -06:00 |
|
|
04fef5dad5
|
agony
|
2025-02-12 00:18:24 -06:00 |
|
|
79c504c278
|
cleaned up encode/decode functions to make them a little more coherent, added option to batch encode/decode (would have been very nice in the past, but this should speed things up for me when i fall for the latest meme codec)
|
2025-02-05 20:54:31 -06:00 |
|
|
bb2ebe1ca2
|
fixed issues that may rise from updating transformers with attention, added nvidia/audio-codec-44khz backend support (by gutting everything necessary because I do NOT want to install more dependencies
|
2025-02-04 20:30:07 -06:00 |
|
|
69c1d2991f
|
updated mixtral backend (need this for something else)
|
2025-01-20 21:50:56 -06:00 |
|
|
b445f4abb6
|
experimental
|
2025-01-05 19:05:00 -06:00 |
|
|
2e6a7625e4
|
experimental
|
2025-01-05 12:47:03 -06:00 |
|
|
9b0d2ccbe1
|
|
2024-12-26 21:42:17 -06:00 |
|
|
59f56ad099
|
cleaup
|
2024-12-24 23:14:32 -06:00 |
|
|
82e8592f2a
|
working vall_e.cpp
|
2024-12-24 17:54:48 -06:00 |
|
|
497bdfc67b
|
more work (the wall is non-causal decoding......)
|
2024-12-22 20:11:31 -06:00 |
|
|
5f289db275
|
ugh
|
2024-12-22 16:15:24 -06:00 |
|
|
0d4329d2e3
|
sanity cleanup
|
2024-12-22 15:05:45 -06:00 |
|
|
353e478e68
|
agony
|
2024-12-21 22:52:10 -06:00 |
|
|
91caf00212
|
ugh
|
2024-12-20 17:13:37 -06:00 |
|
|
59bf6b8b33
|
exposed additional task (ns, sr, vc) (vc is experimental)
|
2024-12-20 11:15:29 -06:00 |
|
|
e7e7f48043
|
livid
|
2024-12-19 19:25:27 -06:00 |
|
|
cddf8ca814
|
sort batches to try and reduce number of padded tokens in batched inference (also commented out F5 samples getting added to the demo page because I would have to regenerate them)
|
2024-12-11 22:45:38 -06:00 |
|
|
61ed662856
|
ACTUALLY actually fix KD-loss (the -inf in the logits was caused by cringecode)
|
2024-12-07 12:31:54 -06:00 |
|
|
34a66e1052
|
agnostified KD
|
2024-12-06 23:53:46 -06:00 |
|
|
953d3eb030
|
ugh
|
2024-12-06 22:35:30 -06:00 |
|
|
42fafbaaca
|
actually fixed knowledge distillation because of errant -inf logits causing problems and needed to be filtered (and splitting text language / output audio language because it helps)
|
2024-12-06 21:55:20 -06:00 |
|
|
23d402bf01
|
added knowledge distillation in the trainer (sadly it is not agnostic because of the grave mistake of further processing the batch within the forward pass, so subsequent calls do not match......)
|
2024-12-05 23:05:52 -06:00 |
|
|
84a05acb6d
|
touch ups in docs
|
2024-12-02 19:10:42 -06:00 |
|