• https://git.ecker.tech/ aims to provide a place to share my efforts while maintaining true ownership of my code, as I do not trust GitHub.

    XMR: 4B9TQdkAkBFYrbj5ztvTx89e5LpucPeTSPzemCihdDi9EBnx7btn8RDNZTBz2zihWsjMnDkzn5As1LU6gLv3KQy8BLsZ8SG

  • Joined on 2022-10-10
mrq pushed to master at mrq/vall-e 2024-11-13 17:34:36 +00:00
269648605e move NAR-len rvq level 0 to separate embedding
mrq pushed to master at mrq/vall-e 2024-11-13 17:05:17 +00:00
29e45be0b4 tweaks to bucket sampling
mrq pushed to master at mrq/vall-e 2024-11-13 16:43:57 +00:00
aed09bb211 tweaks to bucket sampling
mrq pushed to master at mrq/vall-e 2024-11-13 16:39:37 +00:00
5a35e82fb1 tweaks to bucket sampling
mrq pushed to master at mrq/vall-e 2024-11-13 16:31:20 +00:00
mrq pushed to master at mrq/vall-e 2024-11-13 16:13:36 +00:00
be83ddabaa better causal-ness for split loss calc, and also do masking for NAR-len for it
mrq pushed to master at mrq/vall-e 2024-11-13 15:49:57 +00:00
mrq pushed to master at mrq/vall-e 2024-11-13 15:39:36 +00:00
ad7cfffc00 NAR-len RVQ-0 was being trained causally.............
mrq pushed to master at mrq/vall-e 2024-11-13 15:05:21 +00:00
976ee87f6f resume iteration step in tqdm trainer, warn to logger if the sampler state dict was invalidated
mrq pushed to master at mrq/vall-e 2024-11-13 15:03:06 +00:00
8286aa54c8 do not pass timestep token/embedding since it doesn't seem to matter at all after all, fixed training masking rate to 80% because a paper said so
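The "fixed training masking rate to 80%" commit above refers to randomly masking a fixed fraction of tokens during training. A minimal generic sketch of such a Bernoulli token mask (illustrative only, not the repo's actual code; the function name and parameters are assumptions):

```python
import random

def make_token_mask(seq_len, mask_rate=0.8, rng=None):
    """Draw a Bernoulli mask over a token sequence (generic sketch).

    True means the token at that position is masked out during training.
    mask_rate=0.8 mirrors the fixed 80% rate mentioned in the commit.
    """
    rng = rng or random.Random(0)  # seeded here only for reproducibility of the sketch
    return [rng.random() < mask_rate for _ in range(seq_len)]
```

On average about 80% of positions come back True, so the loss is computed mostly on masked tokens.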
mrq pushed to master at mrq/vall-e 2024-11-13 04:26:30 +00:00
caf721c67b set it to zero because it'll make the stop token hide more often than not
mrq pushed to master at mrq/vall-e 2024-11-13 04:25:48 +00:00
0f2584eba7 new meme sampler PogChamp new meme sampler PogChamp (it sort of helps?)
mrq pushed to master at mrq/vall-e 2024-11-13 04:24:16 +00:00
e96a5f0117 new meme sampler PogChamp new meme sampler PogChamp (it sort of helps?)
mrq pushed to master at mrq/vall-e 2024-11-12 22:38:01 +00:00
663f07038d haha... (do not create a token dropout/noise mask when not training (this sadly didnt fix NAR-len output))
mrq pushed to master at mrq/vall-e 2024-11-12 19:38:18 +00:00
b09328069e actually do CFG sampling for base AR+NAR tasks
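The CFG (classifier-free guidance) sampling mentioned above is, in its usual form, an extrapolation of conditional logits away from unconditional ones. A generic sketch of that mixing step (not the repo's code; names and the scale value are illustrative):

```python
def cfg_mix(cond_logits, uncond_logits, scale=1.5):
    """Classifier-free guidance on logits (generic sketch).

    scale=1.0 returns the conditional logits unchanged; larger values
    push the distribution further toward the conditioning signal.
    """
    return [u + scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]
```

The mixed logits are then sampled from as usual (softmax, top-k, etc.).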
mrq pushed to master at mrq/vall-e 2024-11-12 18:45:42 +00:00
2495a7ef67 Fixed STT in the web UI
mrq pushed to master at mrq/vall-e 2024-11-12 03:36:00 +00:00
8927bad7bc actually fixed rep pen (for ar and nar, it seems to help with nar unmasking)
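For context on the "rep pen" commits above and below: a repetition penalty typically rescales the logits of tokens that have already been emitted (CTRL-style), making repeats less likely. A generic sketch, not the repo's implementation (function name and penalty value are assumptions):

```python
def apply_repetition_penalty(logits, prev_tokens, penalty=1.2):
    """Penalize already-seen token ids in a logit vector (generic sketch).

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so seen tokens always become less probable.
    """
    out = list(logits)
    for t in set(prev_tokens):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out
```

Getting the sign handling right matters: naively dividing a negative logit by the penalty would make the token *more* likely, which is one common source of the kind of regression the later commit reverts.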
mrq pushed to master at mrq/vall-e 2024-11-12 02:35:26 +00:00
ec92613847 actually pass input prompt length size to inference
mrq pushed to master at mrq/vall-e 2024-11-12 02:30:49 +00:00
b1df6a7bed reverted rep pen sampler due to a regression
mrq pushed to master at mrq/vall-e 2024-11-12 02:23:22 +00:00
b1f4db39c8 threw in CFG sampling for normal model as well to experiment with