Commit Graph

  • c56ce033d9 work on an interleaved AR (spoiler: it does not work) mrq 2023-09-03 21:27:58 -0500
  • 8a6c203277 added per-speaker samplers mrq 2023-09-03 21:27:13 -0500
  • 81b05dabb9 accurate epoch metric is now reported (based on samples processed / length of the dataset's paths, rather than naive assumptions; see the epoch-metric sketch after the log) mrq 2023-09-03 08:03:36 -0500
  • 922404285c fixed segfault from the tts-c task token index being too big (slotted it into the hypothetical svc task token instead, because in reality svc is never ever going to be a feasible task to train against) mrq 2023-09-02 19:25:43 -0500
  • 4613781e23 integrated plot script, added tts-c task token to help the model be able to mix between normal VALL-E and VALL-E continuous mrq 2023-09-02 16:29:53 -0500
  • f7e942ec99 modified plotting script to be more agnostic about the X axis mrq 2023-09-02 13:59:43 -0500
  • 71e68a8528 tweaked tts-continuous task mrq 2023-09-02 13:39:17 -0500
  • 21e5d250cc fixed up plot script that I forgot about mrq 2023-09-02 13:31:04 -0500
  • 57db3ccfa8 shuffled VALL-E continuous as a task tts-c instead, logic fixes for it mrq 2023-09-02 12:23:40 -0500
  • 2f06166ddd cleanups mrq 2023-09-01 21:33:51 -0500
  • e40c0d34a0 somewhat got recurrent forward working (it's as accurate as chunkwise forward: it's not accurate at all), added option to use AMP instead of blanket-setting the weights' dtype (see the AMP sketch after the log) mrq 2023-09-01 20:58:29 -0500
  • 2bc2d08b09 added (needs verifying) the ability to modify the model size, and a config bool to align with VALL-E continuous' methodology mrq 2023-09-01 17:19:34 -0500
  • 5c8694db8e nasty bandaid if there's no validation dataset specified during training (for example, during finetunes) mrq 2023-08-30 18:23:05 -0500
  • 7f4388e591 added total samples processed and tokens processed (len of text tokens + len of target response tokens) mrq 2023-08-28 11:02:45 -0500
  • 87c4bfedba added ability to mark models as disabled for training, and hotloading them for eval/validation (useful if training only one model, or training a model per GPU) mrq 2023-08-27 12:26:12 -0500
  • 165a1154e0 Undo naive=False test flag, this shouldn't have made its way in mrq 2023-08-26 22:00:43 -0500
  • 78378ed1ce overhauled dataloading code to be marginally faster, mostly cleaned up, and can leverage a metadata json to help things out mrq 2023-08-26 19:53:23 -0500
  • 7b3be3d7bf added helper scripts to process LibriTTS/LibriLight, detect duplicate speaker+books between them, and script to directly phonemize and quantize LibriTTS mrq 2023-08-26 10:21:12 -0500
  • 16e0020901 disabled chunkwise_recurrent for 2x speed gains (I suppose it has been working the entire time, but I have not been properly grabbing things, and this might explain why the output is bad) mrq 2023-08-25 19:50:19 -0500
  • 6455a2f9d7 I think I fixed a bug? mrq 2023-08-24 23:33:36 -0500
  • f3fbed5ffd updated notices tailored for windows / low VRAM cards mrq 2023-08-24 17:19:10 -0500
  • 0517d620b8 fixes with the local backend mrq 2023-08-24 17:05:56 -0500
  • 00ad4af651 updated draconian requirement for espeak-ng to be installed and the env var set to the DLL on Windows mrq 2023-08-24 14:57:01 -0500
  • b6c9686f7d Do not install DeepSpeed under Windows (to-do: default backend to use local if on Windows) mrq 2023-08-24 14:27:36 -0500
  • 22904a8639 more oversights fixed because I've been using a cached dataloader forever now and didn't catch these problems mrq 2023-08-24 10:25:33 -0500
  • 5873c27f1a oops mrq 2023-08-24 09:20:47 -0500
  • 501a857d5d oops mrq 2023-08-23 17:03:25 -0500
  • 4585824cd3 tweaks, including exporting on save/quit mrq 2023-08-23 16:43:03 -0500
  • d106598403 do not utilize diskcache if a config yaml is not loaded mrq 2023-08-23 11:02:15 -0500
  • 524d289c9c Forgot to re-add in setting the weight's dtype on model load mrq 2023-08-22 22:57:23 -0500
  • 9c5a33bfd2 added repo with my weights so far mrq 2023-08-22 13:09:44 -0500
  • 7b1b82e0e5 inferencing cleanup mrq 2023-08-20 21:36:02 -0500
  • a47029065b I don't know if the lack of start/stop tokens being added was causing my inference tests to fail, but it seems better now mrq 2023-08-20 19:21:54 -0500
  • 736c077282 oops mrq 2023-08-20 13:42:18 -0500
  • b105f6211e added ability to export weights mid-training, to avoid the pain of having to yank the weights out while the training script is running mrq 2023-08-20 13:39:58 -0500
  • fc576010ce wrapped saving the checkpoint in a try/catch so I can stop waking up to the damn trainer crashing because it ran out of disk space; I'd much rather it keep training and give me time to eventually clear up disk space than have it silently restart on its own (see the checkpoint-save sketch after the log) mrq 2023-08-20 06:29:17 -0500
  • 2d1a9f10c0 nightmare of spaghetti that might break compat; mechanism to increase the RVQ bins of an existing model without retraining, keeps sampled proms/resps at the max RVQ level and trims off excess levels according to the model receiving them, and some other things I already forgot (I really hope no one else has weights being baked right now; see the embedding-growth sketch after the log) mrq 2023-08-19 15:06:33 -0500
  • f7f6d3bf6d validated that the SpeechX tasks cse and nse work, added a method to test each task by invoking python3 -m vall_e.data --action=tasks --tasks='sr,se,cse,nse' mrq 2023-08-19 09:50:07 -0500
  • 6ca347e1e1 literally had a urethra moment before going to bed with a way to implement cse/nse tasks mrq 2023-08-19 01:16:46 -0500
  • 8f42c578c9 setting up for allowing training for a partial amount of the SpeechX tasks (do NOT try this at home yet without a proper model, as performance is predicated on having a solid base VALL-E model for the tasks) mrq 2023-08-19 00:16:08 -0500
  • ae9d38aa31 forgot to have it pull from the specified noise into the hdf5 dataset mrq 2023-08-18 23:57:07 -0500
  • 77292c42f9 tested the training preparation for tasks ns, sr, and tse (I don't expect it to go well with only 2 RVQ bins) mrq 2023-08-18 23:55:40 -0500
  • bbb0563b3d pseudocode/polyfill/stub: some other flavor of working on adding the tasks mrq 2023-08-18 22:22:13 -0500
  • 0b46c1e312 god I am inexperienced with retaining compat from previous weights, I hope no one actually has weights mrq 2023-08-18 21:29:20 -0500
  • 508677fcd5 repaired auraloss loss calc during eval/val mrq 2023-08-18 21:19:47 -0500
  • fb4e816823 oops mrq 2023-08-18 21:11:19 -0500
  • 2a71486cb6 preparing for SpeechX extensions mrq 2023-08-18 20:58:07 -0500
  • ced31fd9b7 removed the sampler as it's very misleading mrq 2023-08-18 14:47:48 -0500
  • 8e7f900210 forgot the = mrq 2023-08-17 19:07:59 -0500
  • 3ff7cf8341 maybe fix evaluation dataset not being capped to cfg.evaluation.size mrq 2023-08-17 18:56:37 -0500
  • ee58db746f actually make the evaluation dataset shuffled for sample_type=speaker mrq 2023-08-17 15:04:45 -0500
  • 18403a3523 maybe fixes eval dataloader not shuffling under distributed mrq 2023-08-17 13:41:53 -0500
  • 03872b823f why did I type rglob, another 10 bucks down the drain... mrq 2023-08-17 00:11:29 -0500
  • b5f247aa11 just nuked about 9 hours of progress because I didn't make sure it pruned only on the global leader mrq 2023-08-16 23:37:52 -0500
  • d7152fc7b9 added pruning of old checkpoints if specified (cfg.trainer.keep_last_checkpoints; see the pruning sketch after the log) mrq 2023-08-16 20:12:12 -0500
  • 44c08d828e added a sample_type that samples from speakers, to truly balance an epoch by speaker rather than across the entire dataset, plus a sampler that tries to balance by speaker (see the speaker-sampling sketch after the log) mrq 2023-08-16 19:39:21 -0500
  • 599e47a813 might fix user-inputted saving/quitting breaking under distributed training mrq 2023-08-15 23:52:20 -0500
  • 1e3e1d9315 tweaks mrq 2023-08-15 21:58:16 -0500
  • 277c759ab1 fixed issue with non-distributed training, oops mrq 2023-08-14 21:42:35 -0500
  • 5fa86182b5 oops mrq 2023-08-14 10:50:40 -0500
  • 13571380be made exporter make more sense mrq 2023-08-13 22:56:28 -0500
  • d7deaf6def distributed training works now (hopefully) mrq 2023-08-13 22:07:45 -0500
  • 2af09d0bef fixed that mysterious discrepancy between the reported losses (I am so freaking mad, my piss is boiling, I had to interrupt halfway through an epoch) mrq 2023-08-05 15:25:41 -0500
  • d1b9770d41 set model to eval when inferencing (very important; see the eval-mode sketch after the log) mrq 2023-08-05 04:29:05 +0000
  • d89568a96e some fixes for the local framework mrq 2023-08-05 03:22:15 +0000
  • 5970f254e3 some fixes for the local framework mrq 2023-08-05 02:17:30 +0000
  • 012f54b7f1 another classic commit so I can copy it to another machine, to gut out things and use the trainer bits for a side project that I should really get around to working on sooner rather than later mrq 2023-08-04 14:21:30 -0500
  • 0a524f1d59 reticulating splines mrq 2023-08-03 21:39:00 -0500
  • 608c1970eb oops mrq 2023-08-03 20:36:19 -0500
  • c85101403f big cleanup mrq 2023-08-03 20:26:36 -0500
  • 2e03e5ac93 Fixed an issue where having fairseq installed at all would brick logging mrq 2023-08-02 22:57:10 -0500
  • f6597e2dfe adjustments mrq 2023-08-02 18:36:26 -0500
  • 0f9b81de75 oops mrq 2023-08-02 18:12:36 -0500
  • 7a06b27a9c Tweaks mrq 2023-08-02 22:06:39 +0000
  • d88e43800b adjustments mrq 2023-08-02 22:01:49 +0000
  • bf8cedc9dd Rewrite init mrq 2023-08-02 21:53:35 +0000
  • e94fdc10ed Merge a31aefe698 into 3476d393d2 Orjwan Zaafarani 2023-01-22 18:37:15 +0300
  • a31aefe698 Updated README.md oturki 2023-01-22 18:33:38 +0300
  • d3900682d0 Updated README.md oturki 2023-01-22 18:32:49 +0300
  • fee11cfe16 Created a Dockerfile and updated the README.md oturki 2023-01-22 18:31:32 +0300
  • 3476d393d2 Update README.md Zhe Niu 2023-01-19 08:23:10 +0800
  • 5548571917 Add Colab Zhe Niu 2023-01-19 02:11:43 +0800
  • 3b2304228c Only do empty test for training enhuiz 2023-01-19 01:20:39 +0800
  • 2639ba9cab Simplify enhuiz 2023-01-19 00:03:51 +0800
  • d80ef1d970 Remove unused test dl enhuiz 2023-01-18 23:52:49 +0800
  • 2e9f5030a3 Don't cache dataloader by default, raise if there is no valid path enhuiz 2023-01-18 10:20:20 +0800
  • 2a68378421 Avoid implicit seeding so continued training is reproducible (see the seeding sketch after the log) enhuiz 2023-01-17 22:47:22 +0800
  • 6b8da6cb05 Update utils, logging real grad norm enhuiz 2023-01-17 22:30:38 +0800
  • b9a949a534 Fix docs enhuiz 2023-01-17 19:39:33 +0800
  • 81da0ceaa3 Add model exporting and CLI, update docs and testing data enhuiz 2023-01-17 19:38:13 +0800
  • 16dd5f65f6 Fix enhuiz 2023-01-16 19:50:22 +0800
  • 4c5ebd2f1e Add data enhuiz 2023-01-16 16:44:10 +0800
  • 035f48d670 Fix device enhuiz 2023-01-16 16:39:16 +0800
  • b5e1ab8057 Fix a difference in NAR implementation enhuiz 2023-01-16 16:34:05 +0800
  • 8188506440 Use different sampling temperatures for AR and NAR (see the temperature sketch after the log) enhuiz 2023-01-16 12:35:02 +0800
  • 77b52e42ce Batch accumulation. Ignore prompt and text loss in NAR and prompt loss in AR. Sampling temperature. enhuiz 2023-01-16 02:01:00 +0800
  • b7d3c89d6d Add __init__.py enhuiz 2023-01-15 22:02:36 +0800
  • f6c6df00b5 Update default configs enhuiz 2023-01-15 12:14:33 +0800
  • 7f99d692c6 Add a plot script enhuiz 2023-01-15 11:42:00 +0800
  • f33ce7e5a9 Update requirements enhuiz 2023-01-14 17:23:35 +0800
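
Sketches

A few of the commits above describe mechanisms concretely enough to illustrate. The sketches below are hypothetical reconstructions, not this repo's actual code: function names, config fields, and constant values are assumptions unless a commit message states them.

For 81b05dabb9: a minimal sketch of an epoch metric computed from samples actually processed, rather than naive assumptions about how many steps make up an epoch. The helper name is made up.

```python
# Hypothetical helper: epoch progress measured from samples actually
# processed, rather than assumed from step count * batch size.
def epoch_progress(samples_processed: int, num_paths: int) -> float:
    return samples_processed / num_paths

# e.g. 150k samples seen against a 100k-path dataset -> 1.5 epochs
print(epoch_progress(150_000, 100_000))  # 1.5
```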
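For e40c0d34a0: a sketch of the AMP option, assuming the standard PyTorch autocast/GradScaler pattern (requires a CUDA device); the model and shapes are placeholders.

```python
import torch

model = torch.nn.Linear(256, 256).cuda()    # weights stay float32
optimizer = torch.optim.AdamW(model.parameters())
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(8, 256, device="cuda")
optimizer.zero_grad()
# autocast runs the forward pass in half precision per-op, instead of
# blanket-casting the weights themselves to float16
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = model(x).square().mean()
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```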
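For fc576010ce: a sketch of wrapping the checkpoint save in a try/catch so an out-of-disk error does not kill the run; the trainer object and its save_checkpoint method are assumed, not the repo's API.

```python
import logging

def save_checkpoint_safely(trainer, path: str) -> None:
    # If the disk is full (or anything else goes wrong at save time),
    # log it and keep training; a lost checkpoint is cheaper than a
    # silently restarted run.
    try:
        trainer.save_checkpoint(path)  # hypothetical trainer API
    except Exception:
        logging.exception("checkpoint save failed; continuing training")
```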
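For 2d1a9f10c0: a sketch of growing a model's RVQ coverage without retraining, assuming one embedding row per (RVQ level, code) pair; the layout, sizes, and names are all assumptions.

```python
import torch

# Assumed layout: one embedding row per (RVQ level, code) pair.
old_emb = torch.nn.Embedding(2 * 1024, 512)  # e.g. trained with 2 RVQ levels
new_emb = torch.nn.Embedding(4 * 1024, 512)  # grown to 4 levels
with torch.no_grad():
    # Copy the trained rows in; the added rows keep their fresh init and
    # get trained going forward, so the old levels need no retraining.
    new_emb.weight[: old_emb.num_embeddings] = old_emb.weight
```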
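For d7152fc7b9 (and the b5f247aa11 postmortem): a sketch of checkpoint pruning guarded so only the global leader deletes; the function, glob pattern, and rank convention are assumptions.

```python
from pathlib import Path

def prune_checkpoints(ckpt_dir: str, keep_last: int, global_rank: int) -> None:
    # Guard: only the global leader deletes. Letting every rank prune
    # concurrently is how checkpoints (and hours of progress) get nuked.
    if global_rank != 0:
        return
    ckpts = sorted(Path(ckpt_dir).glob("*.pth"), key=lambda p: p.stat().st_mtime)
    for stale in ckpts[:-keep_last]:  # keep the newest keep_last files
        stale.unlink()
```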
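For 44c08d828e: a sketch of speaker-balanced sampling, drawing a speaker first and then an utterance, so prolific speakers don't dominate an epoch; the data layout is a placeholder.

```python
import random

def sample_by_speaker(paths_by_speaker: dict[str, list[str]]) -> str:
    # Draw a speaker uniformly first, then one of their utterances, so a
    # speaker with 10,000 clips is seen no more often than one with 10.
    speaker = random.choice(list(paths_by_speaker))
    return random.choice(paths_by_speaker[speaker])

dataset = {"spk_a": ["a1.wav", "a2.wav", "a3.wav"], "spk_b": ["b1.wav"]}
print(sample_by_speaker(dataset))
```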
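For d1b9770d41: a sketch of why eval mode matters at inference time, using a stock PyTorch layer as a stand-in for the actual model.

```python
import torch

model = torch.nn.TransformerEncoderLayer(d_model=64, nhead=4)

# eval() turns off dropout (and switches norm layers to inference
# behavior); inference_mode() skips autograd bookkeeping. Skipping
# eval() leaves dropout firing during inference, degrading output.
model.eval()
with torch.inference_mode():
    out = model(torch.randn(10, 1, 64))  # (seq, batch, dim)
print(out.shape)
```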
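For 2a68378421: a sketch of explicit seeding so that resuming a run replays the same data order; the seed arithmetic is an assumption, not the upstream implementation.

```python
import random

def shuffled_indices(n: int, seed: int, epoch: int) -> list[int]:
    # An explicit, epoch-indexed seed makes the shuffle order a pure
    # function of (seed, epoch), so a resumed run reproduces the same
    # order instead of depending on leftover global RNG state.
    order = list(range(n))
    random.Random(seed + epoch).shuffle(order)
    return order

assert shuffled_indices(10, seed=42, epoch=3) == shuffled_indices(10, seed=42, epoch=3)
```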
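For 8188506440: a sketch of temperature-scaled sampling, with separate temperatures per model; the specific values are illustrative, not the upstream defaults.

```python
import torch

def sample_token(logits: torch.Tensor, temperature: float) -> torch.Tensor:
    # Dividing logits by the temperature flattens (>1.0) or sharpens
    # (<1.0) the distribution before sampling from it.
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1)

logits = torch.randn(1, 1024)
ar_token = sample_token(logits, temperature=1.0)   # e.g. looser for the AR
nar_token = sample_token(logits, temperature=0.2)  # e.g. greedier for the NAR
```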