James Betker
f563a8dd41
fixes
2022-03-15 21:43:00 -06:00
James Betker
1e3a8554a1
updates to audio_diffusion_fid
2022-03-15 11:35:09 -06:00
James Betker
7929fd89de
Refactor audio-style models into the audio folder
2022-03-15 11:06:25 -06:00
James Betker
e045fb0ad7
fix clip grad norm with scaler
2022-03-13 16:28:23 -06:00
James Betker
08599b4c75
fix random_audio_crop injector
2022-03-12 20:42:29 -07:00
James Betker
c4e4cf91a0
add support for the original vocoder to audio_diffusion_fid; also add a new "intelligibility" metric
2022-03-08 15:53:27 -07:00
James Betker
3e5da71b16
add grad scaler scale to metrics
2022-03-08 15:52:42 -07:00
James Betker
d1dc8dbb35
Support tts9
2022-03-05 20:14:36 -07:00
James Betker
93a3302819
Push training_state data to CPU memory before saving it
For whatever reason, keeping this on GPU memory just doesn't work.
When you load it, it consumes a large amount of GPU memory and that
utilization doesn't go away. Saving to CPU should fix this.
2022-03-04 17:57:33 -07:00
James Betker
6000580e2e
df
2022-03-04 13:47:00 -07:00
James Betker
382681a35d
Load diffusion_fid DVAE into the correct cuda device
2022-03-04 13:42:14 -07:00
James Betker
e1052a5e32
Move log consensus to train for efficiency
2022-03-04 13:41:32 -07:00
James Betker
ce6dfdf255
Distributed "fixes"
2022-03-04 12:46:41 -07:00
James Betker
3ff878ae85
Accumulate loss & grad_norm metrics from all entities within a distributed graph
2022-03-04 12:01:16 -07:00
James Betker
f87e10ffef
Make deterministic sampler work with distributed training & microbatches
2022-03-04 11:50:50 -07:00
James Betker
2d1cb83c1d
Add a deterministic timestep sampler, with provisions to employ it every n steps
2022-03-04 10:40:14 -07:00
James Betker
f490eaeba7
Shuffle optimizer states to and from cpu memory during steps
2022-03-04 10:38:51 -07:00
James Betker
3c242403f5
adjust location of pre-optimizer step so I can visualize the new grad norms
2022-03-04 08:56:42 -07:00
James Betker
58019a2ce3
audio diffusion fid updates
2022-03-03 21:53:32 -07:00
James Betker
6873ad6660
Support functionality
2022-03-03 21:52:16 -07:00
James Betker
70fa780edb
Add mechanism to export grad norms
2022-03-01 20:19:52 -07:00
James Betker
db0c3340ac
Implement guidance-free diffusion in eval
And a few other fixes
2022-03-01 11:49:36 -07:00
James Betker
2134f06516
Implement conditioning-free diffusion at the eval level
2022-02-27 15:11:42 -07:00
James Betker
ac920798bb
misc
2022-02-27 14:49:11 -07:00
James Betker
f458f5d8f1
abort early if losses reach nan too often, and save the model
2022-02-24 20:55:30 -07:00
James Betker
18dc62453f
Don't step if NaN losses are encountered.
2022-02-24 17:45:08 -07:00
James Betker
7c17c8e674
gurgl
2022-02-23 21:28:24 -07:00
James Betker
81017d9696
put frechet_distance on cuda
2022-02-23 21:21:13 -07:00
James Betker
9a7bbf33df
f
2022-02-23 18:03:38 -07:00
James Betker
b7319ab518
Support vocoder type diffusion in audio_diffusion_fid
2022-02-23 17:25:16 -07:00
James Betker
58f6c9805b
adf
2022-02-22 23:12:58 -07:00
James Betker
03752c1cd6
Report NaN
2022-02-22 23:09:37 -07:00
James Betker
6313a94f96
eval: integrate an n-gram language model into decoding
2022-02-21 19:12:34 -07:00
James Betker
7b12799370
Reformat mel_text_clip for use in eval
2022-02-19 20:37:26 -07:00
James Betker
bcba65c539
DataParallel Fix
2022-02-19 20:36:35 -07:00
James Betker
34001ad765
et
2022-02-18 18:52:33 -07:00
James Betker
a813fbed9c
Update to evaluator
2022-02-17 17:30:33 -07:00
James Betker
79e8f36d30
Convert CLIP models into new folder
2022-02-15 20:53:07 -07:00
James Betker
8f767b8b4f
...
2022-02-15 07:08:17 -07:00
James Betker
29e07913a8
Fix
2022-02-15 06:58:11 -07:00
James Betker
dd585df772
LAMB optimizer
2022-02-15 06:48:13 -07:00
James Betker
2bdb515068
A few mods to make wav2vec2 trainable with DDP on DLAS
2022-02-15 06:28:54 -07:00
James Betker
52b61b9f77
Update scripts and attempt to figure out how UnifiedVoice could be used to produce CTC codes
2022-02-13 20:48:06 -07:00
James Betker
a4f1641eea
Add & refine WER evaluator for w2v
2022-02-13 20:47:29 -07:00
James Betker
e16af944c0
BSO fix
2022-02-12 20:01:04 -07:00
James Betker
15fd60aad3
Allow EMA training to be disabled
2022-02-12 20:00:23 -07:00
James Betker
102142d1eb
f
2022-02-11 11:05:13 -07:00
James Betker
40b08a52d0
dafuk
2022-02-11 11:01:31 -07:00
James Betker
f6a7f12cad
Remove broken evaluator
2022-02-11 11:00:29 -07:00
James Betker
46b97049dc
Fix eval
2022-02-11 10:59:32 -07:00
James Betker
5175b7d91a
training sweeper checkin
2022-02-11 10:46:37 -07:00
James Betker
d1d1ae32a1
audio diffusion frechet distance measurement!
2022-02-10 22:55:46 -07:00
James Betker
23a310b488
Fix BSO
2022-02-10 20:54:51 -07:00
James Betker
1e28e02f98
BSO improvement to make it work with distributed optimizers
2022-02-10 09:53:13 -07:00
James Betker
836eb08afb
Update BSO to use the proper step size
2022-02-10 09:44:15 -07:00
James Betker
3d946356f8
batch_size_optimizer works. sweet! no more tuning batch sizes.
2022-02-09 14:26:23 -07:00
James Betker
18938248e4
Add batch_size_optimizer support
2022-02-08 23:51:31 -07:00
James Betker
de1a1d501a
Move audio injectors into their own file
2022-02-03 21:42:37 -07:00
James Betker
fbea6e8eac
Adjustments to diffusion networks
2022-01-30 16:14:06 -07:00
James Betker
798ed7730a
i like wasting time
2022-01-24 18:12:08 -07:00
James Betker
fc09cff4b3
angry
2022-01-24 18:09:29 -07:00
James Betker
cc0d9f7216
Fix
2022-01-24 18:05:45 -07:00
James Betker
3a9e3a9db3
consolidate state
2022-01-24 17:59:31 -07:00
James Betker
dfef34ba39
Load ema to cpu memory if specified
2022-01-24 15:08:29 -07:00
James Betker
49edffb6ad
Revise device mapping
2022-01-24 15:08:13 -07:00
James Betker
33511243d5
load model state dicts into the correct device
it's not clear to me that this will make a huge difference, but it's a good idea anyways
2022-01-24 14:40:09 -07:00
James Betker
3e16c509f6
Misc fixes
2022-01-24 14:31:43 -07:00
James Betker
e420df479f
Allow steps to specify which state keys to carry forward (reducing memory utilization)
2022-01-24 11:01:27 -07:00
James Betker
62475005e4
Sort data items in descending order, which I suspect will improve performance because we will hit GC less
2022-01-23 19:05:32 -07:00
James Betker
8f48848f91
misc
2022-01-22 08:23:29 -07:00
James Betker
ce929a6b3f
Allow grad scaler to be enabled even in fp32 mode
2022-01-21 23:13:24 -07:00
James Betker
bcd8cc51e1
Enable collated data for diffusion purposes
2022-01-19 00:35:08 -07:00
James Betker
894d245062
More zero_grad fixes
2022-01-08 20:31:19 -07:00
James Betker
2a9a25e6e7
Fix likely defective nan grad recovery
2022-01-08 18:24:58 -07:00
James Betker
65ffe38fce
misc
2022-01-06 22:16:17 -07:00
James Betker
f4484fd155
Add "dataset_debugger" support
This allows the datasets themselves to compile statistics and report them
via tensorboard and wandb.
2022-01-06 12:38:20 -07:00
James Betker
b12f47b36d
Add some noise to voice_voice_clip
2021-12-29 13:56:30 -07:00
James Betker
64cb4a92db
Support adamw_zero
2021-12-25 21:32:01 -07:00
James Betker
776a7abfcc
Support torch DDP _set_static_graph
2021-12-25 21:20:06 -07:00
James Betker
62c8ed9a29
move speech utils
2021-12-16 20:47:37 -07:00
James Betker
e7957e4897
Make loss accumulator for logs accumulate better
2021-12-12 22:23:17 -07:00
James Betker
76f86c0e47
gaussian_diffusion: support fp16
2021-12-12 19:52:21 -07:00
James Betker
aa7cfd1edf
Add support for mel norms across the channel dim
2021-12-12 19:52:08 -07:00
James Betker
63bf135b93
Support norms
2021-12-11 08:30:49 -07:00
James Betker
5a664aa56e
misc
2021-12-11 08:17:26 -07:00
James Betker
306274245b
Also do dynamic range compression across mel
2021-12-10 20:06:24 -07:00
James Betker
faf55684b8
Use slaney norm in the mel filterbank computation
2021-12-10 20:04:52 -07:00
James Betker
32cfcf3684
Turn off optimization in find_faulty_files
2021-12-09 09:02:09 -07:00
James Betker
9191201f05
asd
2021-12-07 09:55:39 -07:00
James Betker
ef15a39841
fix gdi bug?
2021-12-07 09:53:48 -07:00
James Betker
68e9db12b5
Add interleaving and direct injectors
2021-12-02 21:04:49 -07:00
James Betker
47fe032a3d
Try to make diffusion validator more reproducible
2021-11-24 09:38:10 -07:00
James Betker
934395d4b8
A few fixes for gpt_asr_hf2
2021-11-23 09:29:29 -07:00
James Betker
973f47c525
misc nonfunctional
2021-11-22 17:16:39 -07:00
James Betker
3125ca38f5
Further wandb logs
2021-11-22 16:40:19 -07:00
James Betker
0604060580
Finish up mods for next version of GptAsrHf
2021-11-20 21:33:49 -07:00
James Betker
14f3155ec4
misc
2021-11-20 17:45:14 -07:00
James Betker
687e0746b3
Add Torch-derived MelSpectrogramInjector
2021-11-18 20:02:45 -07:00
James Betker
c30a38cdf1
Undo baseline GDI changes
2021-11-18 20:02:09 -07:00
James Betker
f36bab95dd
Audio resample injector
2021-11-10 20:06:33 -07:00