0cca4eb943 | 2025-05-22 13:21:36 -05:00 | disable this cringe precheck for now since it causes problems
f12746b091 | 2025-05-21 16:50:59 -05:00 | allow defining the default model name through env var, register nemo-larger in the model name list thing
e46d7ef2cb | 2025-05-20 23:38:10 -05:00 | warn and ignore export when lora training because the state dict exported during training is wrong
fee02f4153 | 2025-05-20 23:28:29 -05:00 | added option to explicitly load a lora without having to lobotomize yourself with creating a yaml just to do so
5018ddb107 | 2025-05-20 15:13:21 -05:00 | i dont know why this managed to escape my attention
b2b243e7e7 | 2025-05-05 13:03:44 -05:00 | addresses #9
5fe01ffc6c | 2025-04-19 14:04:34 -05:00 | more notes / re-enabled top-k/p samplers for new implementation
f8e1d110dc | 2025-04-18 20:49:00 -05:00 | when you for once use your main rig to test and forget to port things back over
d9e18037cc | 2025-04-18 20:36:44 -05:00 | new implementation tweaks and fixes to make it actually better (there were a lot of badwrong things being done that harmed the output quality, will evaluate the model further)
98d1d8cb1e | 2025-04-17 20:24:40 -05:00 | added some more notes, tweaks (RIP DAC, it's over)
9e27d2e02e | 2025-04-16 15:25:45 -05:00 | huggingface zerogpu cringe
814146a5e0 | 2025-04-12 12:53:44 -05:00 | more settings bloat because there seems to be instability with the encoder as-is
f144389920 | 2025-04-10 23:06:16 -05:00 | the culprit killing newly trained models was initializing the level_weights.............
6c6a34dd21 | 2025-04-07 23:13:35 -05:00 | i can't be assed to test if the prior commit works so being explicit like this should help until i can be bothered to halt training just to test this
6d42c9ae23 | 2025-04-07 22:51:52 -05:00 | how foolish of me, not having a softmax as float32 (maybe addresses an emergent regression where bfloat16 training shits the bed where float16+loss scaling doesnt)
d6cd848c32 | 2025-04-06 21:05:29 -05:00 | goodbye nvidia/audio-codec-44khz, crossed fingers for DAC again
1e22519d94 | 2025-04-05 22:05:39 -05:00 | diagnosed both hf/llama.cpp versions to probably just being a faulty export method (to-do: migrate vall_e.models.base to vall_e.export --hf)
c34763769a | 2025-04-05 18:58:25 -05:00 | ugh
b6692ce3de | 2025-04-05 18:20:46 -05:00 | ugh
4a909ceff8 | 2025-04-05 11:04:26 -05:00 | temp fix for vall_e.cpp demask scoring regression
44260f7445 | 2025-04-05 10:27:07 -05:00 | tweaks
0ede3bfc12 | 2025-04-05 01:22:51 -05:00 | updated vall_e.cpp, but i could have sworn it worked much better than this......
28d39ef962 | 2025-04-03 23:32:58 -05:00 | should not be working late
bfe70e9d56 | 2025-04-03 23:26:00 -05:00 | ugh
2e93438867 | 2025-04-03 19:01:10 -05:00 | reintroduced sampler_type = speaker because I think this might salvage the nemo model to have better speaker similarities
caad99ab78 | 2025-04-02 17:17:37 -05:00 | fix for bsz>1 because I forgot the old implementation implicitly handles this
068dbdb785 | 2025-04-02 17:05:16 -05:00 | ugh
0e995dbf2c | 2025-04-02 17:01:24 -05:00 | is this my last cope (falling back to explicit duration prediction, as this regression just won't go away) (also the smaller model was lobotomized because of my ROCm setup having a botched SDPA for who knows why)
7a0956863d | 2025-03-31 21:11:43 -05:00 | oops
a1184586ef | 2025-03-31 20:59:13 -05:00 | should never have trusted mse_loss, it never works
99f251c768 | 2025-03-30 10:37:40 -05:00 | slight tweaks to condition-less NS/SR
478aea0e8c | 2025-03-28 19:49:54 -05:00 | tweaks
6ae282e090 | 2025-03-28 15:07:06 -05:00 | re-added noise dataloader sampler whatever for the old implementation's other tasks that require it
90b3509404 | 2025-03-27 13:27:51 -05:00 | I'll just cope and say I cannot apply segmented attention masks to the smaller model as it's too trained on not doing it, and the regression came from dumb python aliasing rules
2fd82a7a22 | 2025-03-27 00:51:41 -05:00 | cannot get segmented mask to actually work without gradients exploding (need to find a different way to do duration prediction...)
4d777b5618 | 2025-03-26 12:08:47 -05:00 | add remark that segmented attention actually might be broken (for some reason this only emerged recently, need to investigate)
09e9438941 | 2025-03-25 23:24:01 -05:00 | ugh
8641c87611 | 2025-03-25 23:06:16 -05:00 | nothing could go wrong part 2 (reverted and rewrote commits since there was a nasty regression)
aa8b32d97e | 2025-03-25 18:53:06 -05:00 | added more notes (although I could have sworn I have had more notes that i can't recall)
df5b870908 | 2025-03-22 12:44:34 -05:00 | added remark about not using sliding attention
02a8bcbe29 | 2025-03-21 23:41:34 -05:00 | fixed errant index error (although it makes me wonder if my segmented masking is still flawed)
d1d91295b3 | 2025-03-21 19:05:49 -05:00 | add segmented sliding attention, also found a bug with prom-less segments in the attention mask generation.........
589cfb0e18 | 2025-03-20 17:39:41 -05:00 | yuge speedup because of a dumb oversight
8068f24e35 | 2025-03-20 15:56:15 -05:00 | cleaned up parallel nar, i think it's slightly faster but even the smallest model is still slower than ar+nar-len-llama-8...
9a7458cf17 | 2025-03-19 22:41:48 -05:00 | fixed inferencing since I did delete the len_emb, some more notes on the model since it seems I just had bad experimental settings
61de653ad9 | 2025-03-19 14:20:19 -05:00 | now causal training should work again
85b9dd47c1 | 2025-03-19 13:31:50 -05:00 | ugh
81acd565b3 | 2025-03-18 20:59:33 -05:00 | re-enable these
5479d2eacc | 2025-03-18 19:34:37 -05:00 | more tweaks to the new implementation (properly trim the len stuff to save some params, decoder to d_ffn expansion to 2 to maybe also make it faster, etc.)
9a8a8e3195 | 2025-03-18 08:40:43 -05:00 | off by one bateman