|
e859a7c01d
|
experimental multi-gpu training (Linux only, because I can't into batch files)
|
2023-03-03 04:37:18 +00:00 |
|
|
c956d81baf
|
added button to just load a training set's loss information, added installing broncotc/bitsandbytes-rocm when running setup-rocm.sh
|
2023-03-02 01:35:12 +00:00 |
|
|
534a761e49
|
added loading/saving of voice latents by model hash, so no more needing to manually regenerate every time you change models
|
2023-03-02 00:46:52 +00:00 |
|
|
5a41db978e
|
oops
|
2023-03-01 19:39:43 +00:00 |
|
|
b989123bd4
|
leverage tensorboard to parse tb_logger files when starting training (it seems to give a nicer resolution of training data, need to see about reading it directly while training)
|
2023-03-01 19:32:11 +00:00 |
|
|
c2726fa0d4
|
added new training tunable: loss_text_ce_loss weight; added option to specify a source model in case you want to finetune a finetuned model (for example, train a Japanese finetune on a large dataset, then finetune for a specific voice; need to truly validate if it produces usable output); some bug fixes that came up for some reason now and not earlier
|
2023-03-01 01:17:38 +00:00 |
|
|
5037752059
|
oops
|
2023-02-28 22:13:21 +00:00 |
|
|
787b44807a
|
added to embedded metadata: datetime, model path, model hash
|
2023-02-28 15:36:06 +00:00 |
|
|
81eb58f0d6
|
show different losses, rewordings
|
2023-02-28 06:18:18 +00:00 |
|
|
fda47156ec
|
oops
|
2023-02-28 01:08:07 +00:00 |
|
|
bc0d9ab3ed
|
added graph to chart loss_gpt_total rate, added option to prune X number of previous models/states, something else
|
2023-02-28 01:01:50 +00:00 |
|
|
6925ec731b
|
I don't remember.
|
2023-02-27 19:20:06 +00:00 |
|
|
92553973be
|
Added option to disable bitsandbytes optimizations for systems that do not support them (systems without a Turing-onward Nvidia card); saves use of float16 and bitsandbytes for training into the config JSON
|
2023-02-26 01:57:56 +00:00 |
|
|
aafeb9f96a
|
actually fixed the training output text parser
|
2023-02-25 16:44:25 +00:00 |
|
|
65329dba31
|
oops, epoch increments twice
|
2023-02-25 15:31:18 +00:00 |
|
|
8b4da29d5f
|
some adjustments to the training output parser: now updates per iteration for really large batches (like the one I'm doing for a dataset size of 19420)
|
2023-02-25 13:55:25 +00:00 |
|
|
d5d8821a9d
|
fixed some files not copying for bitsandbytes (I was wrong to assume it copied folders too), fixed stopping generating and training, some other thing that I forgot since it's been slowly worked on in my small free times
|
2023-02-24 23:13:13 +00:00 |
|
|
2104dbdbc5
|
oops
|
2023-02-24 13:05:08 +00:00 |
|
|
f6d0b66e10
|
finally added model refresh button, also searches in the training folder for outputted models so you don't even need to copy them
|
2023-02-24 12:58:41 +00:00 |
|
|
1e0fec4358
|
god i finally found some time and focus: reworded print/save freq per epoch => print/save freq (in epochs), added import config button to reread the last used settings (will check for the output folder's configs first, then the generated ones) and auto-grab the last resume state (if available), some other cleanups i genuinely don't remember what I did when I spaced out for 20 minutes
|
2023-02-23 23:22:23 +00:00 |
|
|
7d1220e83e
|
forgot to mult by batch size
|
2023-02-23 15:38:04 +00:00 |
|
|
487f2ebf32
|
fixed the brain worm discrepancy between epochs, iterations, and steps
|
2023-02-23 15:31:43 +00:00 |
|
|
1cbcf14cff
|
oops
|
2023-02-23 13:18:51 +00:00 |
|
|
225dee22d4
|
huge success
|
2023-02-23 06:24:54 +00:00 |
|
|
526a430c2a
|
how did this revert...
|
2023-02-22 13:24:03 +00:00 |
|
|
93b061fb4d
|
oops
|
2023-02-22 03:21:03 +00:00 |
|
|
c4b41e07fa
|
properly placed the line to extract the starting iteration
|
2023-02-22 01:17:09 +00:00 |
|
|
fefc7aba03
|
oops
|
2023-02-21 22:13:30 +00:00 |
|
|
9e64dad785
|
clamp batch size to sample count when generating for the sickos that want that, added setting to remove non-final output after a generation, something else I forgot already
|
2023-02-21 21:50:05 +00:00 |
|
|
f119993fb5
|
explicitly use python3 because some OSs will not have python aliased to python3, allow batch size 1
|
2023-02-21 20:20:52 +00:00 |
|
|
8a1a48f31e
|
Added very experimental float16 training for cards without enough VRAM (10GiB and below, maybe). !NOTE! this is VERY EXPERIMENTAL, I have zero free time to validate it right now, I'll do it later
|
2023-02-21 19:31:57 +00:00 |
|
|
ed2cf9f5ee
|
wrap checking for metadata when adding a voice in case it throws an error
|
2023-02-21 17:35:30 +00:00 |
|
|
b6f7aa6264
|
fixes
|
2023-02-21 04:22:11 +00:00 |
|
|
bbc2d26289
|
I finally figured out how to fix gr.Dropdown.change, so a lot of dumb UI decisions are fixed and makes sense
|
2023-02-21 03:00:45 +00:00 |
|
|
1fd88afcca
|
updated notebook for newer setup structure, added formatting of getting it/s and last loss rate (have not tested loss rate yet)
|
2023-02-20 22:56:39 +00:00 |
|
|
37ffa60d14
|
brain worms forgot a global, hate global semantics
|
2023-02-20 15:31:38 +00:00 |
|
|
d17f6fafb0
|
clean up, reordered, added some rather liberal loading/unloading auxiliary models, can't really focus right now to keep testing it, report any issues and I'll get around to it
|
2023-02-20 00:21:16 +00:00 |
|
|
c99cacec2e
|
oops
|
2023-02-19 23:29:12 +00:00 |
|
|
ee95616dfd
|
optimize batch sizes to be as evenly divisible as possible (noticed the calculated epochs mismatched the inputted epochs)
|
2023-02-19 21:06:14 +00:00 |
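The divisibility fix above (picking a batch size that divides the dataset evenly so calculated epochs match the inputted epochs) can be sketched roughly like this; the function name and exact strategy are hypothetical, not the repo's actual code:

```python
def clamp_batch_size(requested: int, dataset_size: int) -> int:
    """Largest batch size <= requested that divides dataset_size evenly,
    so no partial final batch skews the epoch/iteration math."""
    for candidate in range(min(requested, dataset_size), 0, -1):
        if dataset_size % candidate == 0:
            return candidate
    return 1  # batch size 1 always divides evenly

# e.g. a requested batch of 64 on a 100-sample dataset falls back to 50
```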
|
|
6260594a1e
|
Forgot to base print/save frequencies in terms of epochs in the UI, will get converted when saving the YAML
|
2023-02-19 20:38:00 +00:00 |
|
|
4694d622f4
|
doing something completely unrelated had me realize it's 1000x easier to just base things in terms of epochs, and calculate iteratsions from there
|
2023-02-19 20:22:03 +00:00 |
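Basing things in epochs and deriving iterations, as the commit above describes, amounts to this small conversion (a sketch with hypothetical names, assuming one epoch is one full pass over the dataset):

```python
import math

def iterations_from_epochs(epochs: int, dataset_size: int, batch_size: int) -> int:
    """Total training iterations implied by an epoch count:
    steps per epoch is the dataset size divided by the batch size,
    rounded up to cover any partial final batch."""
    steps_per_epoch = math.ceil(dataset_size / batch_size)
    return epochs * steps_per_epoch
```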
|
|
4f79b3724b
|
Fixed model setting not getting updated when TTS is unloaded, for when you change it and then load TTS (sorry for that brain worm)
|
2023-02-19 16:24:06 +00:00 |
|
|
092dd7b2d7
|
added more safeties and parameters to training yaml generator, I think I tested it extensively enough
|
2023-02-19 16:16:44 +00:00 |
|
|
d89b7d60e0
|
forgot to divide checkpoint freq by iterations to get checkpoint counts
|
2023-02-19 07:05:11 +00:00 |
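The fix above is just converting an every-N-iterations frequency into a count of checkpoints; a minimal sketch (hypothetical helper, not the repo's code):

```python
def checkpoint_count(total_iterations: int, checkpoint_freq: int) -> int:
    """Number of checkpoints written when one is saved every
    checkpoint_freq iterations over total_iterations."""
    return total_iterations // checkpoint_freq
```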
|
|
485319c2bb
|
don't know what brain worms had me throw printing training output under verbose
|
2023-02-19 06:28:53 +00:00 |
|
|
debdf6049a
|
forgot to copy again from dev folder to git folder
|
2023-02-19 06:04:46 +00:00 |
|
|
ae5d4023aa
|
fix for (I assume) some inconsistency with gradio sometimes-but-not-all-the-time coercing an empty Textbox into an empty string or sometimes None, but I also assume that might be a deserialization issue from JSON (cannot be assed to ask people to screenshot UI or send their ./config/generation.json for analysis, so get this hot monkeyshit patch)
|
2023-02-19 06:02:47 +00:00 |
|
|
57060190af
|
absolutely detest global semantics
|
2023-02-19 05:12:09 +00:00 |
|
|
f44239a85a
|
added polyfill for loading autoregressive models in case mrq/tortoise-tts absolutely refuses to update
|
2023-02-19 05:10:08 +00:00 |
|
|
e7d0cfaa82
|
added some output parsing during training (print current iteration step, and checkpoint save), added option for verbose output (for debugging), added buffer size for output, full console output gets dumped on terminating training
|
2023-02-19 05:05:30 +00:00 |
|