• https://git.ecker.tech/ aims to provide a place to share my efforts while maintaining true ownership of my code, as I do not trust GitHub.

    XMR: 4B9TQdkAkBFYrbj5ztvTx89e5LpucPeTSPzemCihdDi9EBnx7btn8RDNZTBz2zihWsjMnDkzn5As1LU6gLv3KQy8BLsZ8SG

  • Joined on 2022-10-10
mrq pushed to master at mrq/vall-e 2025-02-25 00:59:22 +00:00
918e0dbac1 small slop cleanup
mrq pushed to master at mrq/vall-e 2025-02-25 00:20:57 +00:00
3330b5bb00 maybe fix NaNs being thrown for immature models at fp16 for training evals
mrq pushed to master at mrq/vall-e 2025-02-24 23:47:40 +00:00
mrq pushed to master at mrq/vall-e 2025-02-24 20:34:20 +00:00
33d5a7109a it's a miracle I was able to get a semblance of audio with the naive AudioEncoder (now it interleaves properly)
mrq pushed to master at mrq/vall-e 2025-02-24 19:49:27 +00:00
mrq pushed to master at mrq/vall-e 2025-02-24 19:47:00 +00:00
8f5a3997bd another experimental flag
mrq pushed to master at mrq/vall-e 2025-02-24 19:43:53 +00:00
99ef55d605 override eval for meme model
mrq pushed to master at mrq/vall-e 2025-02-24 19:39:17 +00:00
f93fbf0d99 another experimental flag
mrq pushed to master at mrq/vall-e 2025-02-24 03:16:36 +00:00
mrq pushed to master at mrq/vall-e 2025-02-24 01:09:15 +00:00
cbf6b84e27 fixed grad norm and loss scale not reporting for local trainer
mrq pushed to master at mrq/vall-e 2025-02-23 23:21:14 +00:00
b640fabab5 borrowed muon since it might better work under deepspeed and not require cruft (even though it really does not like the masked-NAR), also make the masked-NAR faux-causal since it might better help out for cfg.model.version >= 7
mrq pushed to master at mrq/vall-e 2025-02-23 18:27:47 +00:00
mrq pushed to master at mrq/vall-e 2025-02-23 18:04:58 +00:00
8f3c3e01ee oops
mrq pushed to master at mrq/vall-e 2025-02-23 17:50:57 +00:00
b39aaacd77 oops
mrq pushed to master at mrq/vall-e 2025-02-23 17:45:03 +00:00
504b1ae832 undo separating mask and stop token, this causes bigly problems...
mrq pushed to master at mrq/vall-e 2025-02-23 17:31:45 +00:00
3019c88799 separate mask token and stop token because this might cause issues
mrq pushed to master at mrq/vall-e 2025-02-23 17:17:38 +00:00
6634d07576 added muon optimizer through kludge hacks because it necessitates a second optimizer in tandem that seems to only sometimes work with deepspeed
mrq pushed to master at mrq/vall-e 2025-02-23 14:26:19 +00:00
67a6009555 (finally) added parallel AR for cfg.model.version >= 7 (nvidia/audio-codec-44khz is being a pain and it might require training purely AR first......)
mrq pushed to master at mrq/vall-e 2025-02-22 20:04:45 +00:00
15b3c20e19 also throw exception for zero'd out tensor during training (I am very paranoid now)
mrq pushed to master at mrq/vall-e 2025-02-22 15:02:43 +00:00
ab0abd2b12 fixes fixes fixes (a quarter of my recently processed audio returned zero'd tensors......)