|
71731ed785
|
added prefixing with silence (was to test something, currently hidden under cfg.experimental=True)
|
2024-10-18 17:19:52 -05:00 |
|
|
75b90be325
|
cleaned up unused config flags, allow less strict yaml by pruning missing keys, renamed some dataset configs to be more unified
|
2024-10-17 17:06:48 -05:00 |
|
|
d0ab7d755a
|
added min-p (really does not seem useful since it's very sensitive), more tweaks to entropix
|
2024-10-11 22:36:06 -05:00 |
|
|
bef43a0c18
|
added experimental entropix sampling support
|
2024-10-11 21:18:26 -05:00 |
|
|
75a4c866d6
|
more demo page tweaks, added arg to force enable/disable LoRAs for inferencing (to-do: setup arg flags to handle this, and checkbox in web UI)
|
2024-10-10 19:04:12 -05:00 |
|
|
2ea978f318
|
added --eval-random-text-prompts to use random text prompts for eval pass, added --random-prompts for demo page and --lora to use a sample with the lora disabled, probably finally fixed validation dataloader breaking on eval
|
2024-10-10 13:40:25 -05:00 |
|
|
54547b74d8
|
experimental implementation of STT (need to actually test on a model, test trainer seems to work)
|
2024-09-05 20:43:20 -05:00 |
|
|
32287710a2
|
moved prints to use logger, edited readme (fused_attn doesn't seem stable for training)
|
2024-08-29 13:27:16 -05:00 |
|
|
3a65cc4b22
|
fix issue with sft and shared tensors...
|
2024-08-04 19:56:21 -05:00 |
|
|
545162195b
|
deprecate sole AR/NAR model by only keeping the AR+NAR (the beauty of no one using this is that I can break compat as much as I want), add tone token for when I classify my dataset with tone/emotion in the future, some other things
|
2024-04-15 19:54:32 -05:00 |
|
|
08bae355eb
|
actually use langs from the dataloader
|
2023-10-11 21:21:50 -05:00 |
|
|
8740cdefc6
|
added initial support for languages (still testing, marked as model version 3), added experimental 'context extend by limiting the resp context' (untested)
|
2023-10-11 20:38:40 -05:00 |
|
|
7facacf7c9
|
separated samplers into their own file, don't bother copying the logits back to the GPU after sampling, it's not necessary
|
2023-10-11 12:25:31 -05:00 |
|
|
e727b6e5c1
|
changed dynamic temperature trigger to be a min-(n)ar-temp value between [0,(n)ar-temp), flags to set min temp, checkbox in web UI to request it
|
2023-10-10 17:02:33 -05:00 |
|
|
c0b25541e3
|
restructured some things with the model to remove dead weights
|
2023-09-20 19:10:59 -05:00 |
|
|
a6bfe43590
|
added mirostat sampling (given a partially trained model, it got far more decent output than I expected, need to test on a better trained model)
|
2023-09-18 18:55:41 -05:00 |
|
|
2567e082b5
|
UGH
|
2023-09-16 00:26:13 -05:00 |
|
|
4aef798135
|
added picking final candidate based on sum of score instead of first candidate (this changes nothing).
|
2023-09-13 13:19:11 -05:00 |
|
|
23a5fdd645
|
implemented a naive beam search (I really should be taking a break)
|
2023-09-12 21:28:07 -05:00 |
|
|
40ef34e1ca
|
this embedding class definitely works, and migrating from the previous embedding weights seems to work.
|
2023-09-11 14:13:42 -05:00 |
|
|
10c34c5b98
|
added a length-based decay factor for repetition penalty
|
2023-09-08 21:02:00 -05:00 |
|
|
14c78bae39
|
added lots of sampling options (top-k/top-p, repetition penalty, length penalty)
|
2023-09-08 20:30:54 -05:00 |
|
|
ab5134f385
|
tweaks and fixes
|
2023-09-07 17:08:38 -05:00 |
|
|
b2c2dec291
|
added homebrewed per-RVQ-bin embedding solutions
|
2023-09-07 16:48:02 -05:00 |
|
|
7ce06432fd
|
fixed the AR+NAR dual model, the resp_emb has to be split up (classifier might too)
|
2023-09-06 19:33:39 -05:00 |
|
|
100ca6b7d0
|
added option to use SGD optimizer through the YAML, added option to pass in additional optimizer parameters through the YAML, added experimental unified AR+NAR model (does not seem fruitful in testing)
|
2023-09-06 18:58:35 -05:00 |
|
|
2f9cd0842f
|
merged dedicated interleaved AR code with the normal AR code
|
2023-09-03 22:46:08 -05:00 |
|
|
8a6c203277
|
added per-speaker samplers
|
2023-09-03 21:27:13 -05:00 |
|
|
2f06166ddd
|
cleanups
|
2023-09-01 21:33:51 -05:00 |
|
|
e40c0d34a0
|
somewhat got recurrent forward working (it's as accurate as chunkwise forward: it's not accurate at all), added option to use AMP instead of blanket setting the weight's dtype
|
2023-09-01 20:58:29 -05:00 |
|
|
2bc2d08b09
|
(need to verify) added modifying model size and config bool to align with VALL-E continuous' methodology
|
2023-09-01 17:19:34 -05:00 |
|
|
165a1154e0
|
Undo naive=False test flag, this shouldn't have made its way in
|
2023-08-26 22:00:43 -05:00 |
|
|
78378ed1ce
|
overhauled dataloading code to be marginally faster, mostly cleaned up, and can leverage a metadata json to help things out
|
2023-08-26 19:53:23 -05:00 |
|
|
2d1a9f10c0
|
nightmare of spaghetti that might break compat; mechanism to increase RVQ bins of an existing model without retraining, keeps sampled proms/resps at max RVQ level and trims off excess levels according to what model receives them, some other things I already forgot (I really hope no one else has weights being baked right now)
|
2023-08-19 15:06:33 -05:00 |
|
|
2a71486cb6
|
preparing for SpeechX extensions
|
2023-08-18 20:58:07 -05:00 |
|
|
d7deaf6def
|
distributed training works now (hopefully)
|
2023-08-13 22:07:45 -05:00 |
|
|
c85101403f
|
big cleanup
|
2023-08-03 20:26:36 -05:00 |
|
|
2e03e5ac93
|
Fixed an issue where having fairseq installed at all would brick logging
|
2023-08-02 22:57:10 -05:00 |
|
|
7a06b27a9c
|
Tweaks
|
2023-08-02 22:06:39 +00:00 |
|