James Betker
|
7f4fc55344
|
Update SR model
|
2022-02-03 21:42:53 -07:00 |
|
James Betker
|
de1a1d501a
|
Move audio injectors into their own file
|
2022-02-03 21:42:37 -07:00 |
|
James Betker
|
687393de59
|
Add a better split_on_silence (processing_pipeline)
Going to extend this a bit more going forwards to support the entire pipeline.
|
2022-02-03 20:00:26 -07:00 |
|
James Betker
|
1d29999648
|
Updates to the TTS production scripts
|
2022-02-03 20:00:01 -07:00 |
|
James Betker
|
bc506d4bcd
|
Mods to unet_diffusion_tts6 to support super resolution mode
|
2022-02-03 19:59:39 -07:00 |
|
James Betker
|
4249681c4b
|
Mods to support an autoregressive CTC code generator
|
2022-02-03 19:58:54 -07:00 |
|
James Betker
|
8132766d38
|
tts6
|
2022-01-31 20:15:06 -07:00 |
|
James Betker
|
fbea6e8eac
|
Adjustments to diffusion networks
|
2022-01-30 16:14:06 -07:00 |
|
James Betker
|
e58dab14c3
|
new diffusion updates from testing
|
2022-01-29 11:01:01 -07:00 |
|
James Betker
|
935a4e853e
|
get rid of nil tokens in <2>
|
2022-01-27 22:45:57 -07:00 |
|
James Betker
|
0152174c0e
|
Add wandb_step_factor argument
|
2022-01-27 19:58:58 -07:00 |
|
James Betker
|
e0e36ed98c
|
Update use_diffuse_tts
|
2022-01-27 19:57:28 -07:00 |
|
James Betker
|
a77d376ad2
|
rename unet diffusion tts and add 3
|
2022-01-27 19:56:24 -07:00 |
|
James Betker
|
7badbf1b4d
|
update usage scripts
|
2022-01-25 17:57:26 -07:00 |
|
James Betker
|
8c255811ad
|
more fixes
|
2022-01-25 17:57:16 -07:00 |
|
James Betker
|
0f3ca28e39
|
Allow diffusion model to be trained with masking tokens
|
2022-01-25 14:26:21 -07:00 |
|
James Betker
|
798ed7730a
|
i like wasting time
|
2022-01-24 18:12:08 -07:00 |
|
James Betker
|
fc09cff4b3
|
angry
|
2022-01-24 18:09:29 -07:00 |
|
James Betker
|
cc0d9f7216
|
Fix
|
2022-01-24 18:05:45 -07:00 |
|
James Betker
|
3a9e3a9db3
|
consolidate state
|
2022-01-24 17:59:31 -07:00 |
|
James Betker
|
dfef34ba39
|
Load ema to cpu memory if specified
|
2022-01-24 15:08:29 -07:00 |
|
James Betker
|
49edffb6ad
|
Revise device mapping
|
2022-01-24 15:08:13 -07:00 |
|
James Betker
|
33511243d5
|
load model state dicts into the correct device
it's not clear to me that this will make a huge difference, but it's a good idea anyways
|
2022-01-24 14:40:09 -07:00 |
|
James Betker
|
3e16c509f6
|
Misc fixes
|
2022-01-24 14:31:43 -07:00 |
|
James Betker
|
e2ed0adbd8
|
use_diffuse_tts updates
|
2022-01-24 14:31:28 -07:00 |
|
James Betker
|
e420df479f
|
Allow steps to specify which state keys to carry forward (reducing memory utilization)
|
2022-01-24 11:01:27 -07:00 |
|
James Betker
|
62475005e4
|
Sort data items in descending order, which I suspect will improve performance because we will hit GC less
|
2022-01-23 19:05:32 -07:00 |
|
James Betker
|
d18aec793a
|
Revert "(re) attempt diffusion checkpointing logic"
This reverts commit b22eec8fe3.
|
2022-01-22 09:14:50 -07:00 |
|
James Betker
|
b22eec8fe3
|
(re) attempt diffusion checkpointing logic
|
2022-01-22 08:34:40 -07:00 |
|
James Betker
|
8f48848f91
|
misc
|
2022-01-22 08:23:29 -07:00 |
|
James Betker
|
851070075a
|
text<->cond clip
I need that universal clip..
|
2022-01-22 08:23:14 -07:00 |
|
James Betker
|
8ada52ccdc
|
Update LR layers to checkpoint better
|
2022-01-22 08:22:57 -07:00 |
|
James Betker
|
ce929a6b3f
|
Allow grad scaler to be enabled even in fp32 mode
|
2022-01-21 23:13:24 -07:00 |
|
James Betker
|
91b4b240ac
|
don't pickle unique files
|
2022-01-21 00:02:06 -07:00 |
|
James Betker
|
7fef7fb9ff
|
Update fast_paired_dataset to report how many audio files it is actually using
|
2022-01-20 21:49:38 -07:00 |
|
James Betker
|
ed35cfe393
|
Update inference scripts
|
2022-01-20 11:28:50 -07:00 |
|
James Betker
|
20312211e0
|
Fix bug in code alignment
|
2022-01-20 11:28:12 -07:00 |
|
James Betker
|
8e2439f50d
|
Decrease resolution requirements to 2048
|
2022-01-20 11:27:49 -07:00 |
|
James Betker
|
4af8525dc3
|
Adjust diffusion vocoder to allow training individual levels
|
2022-01-19 13:37:59 -07:00 |
|
James Betker
|
ac13bfefe8
|
use_diffuse_tts
|
2022-01-19 00:35:24 -07:00 |
|
James Betker
|
bcd8cc51e1
|
Enable collated data for diffusion purposes
|
2022-01-19 00:35:08 -07:00 |
|
James Betker
|
dc9cd8c206
|
Update use_gpt_tts to be usable with unified_voice2
|
2022-01-18 21:14:17 -07:00 |
|
James Betker
|
7b4544b83a
|
Add an experimental unet_diffusion_tts to perform experiments on
|
2022-01-18 08:38:24 -07:00 |
|
James Betker
|
b6190e96b2
|
fast_paired
|
2022-01-17 15:46:02 -07:00 |
|
James Betker
|
1d30d79e34
|
De-specify fast-paired-dataset
|
2022-01-16 21:20:00 -07:00 |
|
James Betker
|
2b36ca5f8e
|
Revert paired back
|
2022-01-16 21:10:46 -07:00 |
|
James Betker
|
ad3e7df086
|
Split the fast random into its own new dataset
|
2022-01-16 21:10:11 -07:00 |
|
James Betker
|
7331862755
|
Updated paired to randomly index data, offsetting memory costs and speeding up initialization
|
2022-01-16 21:09:22 -07:00 |
|
James Betker
|
37e4e737b5
|
a few fixes
|
2022-01-16 15:17:17 -07:00 |
|
James Betker
|
35db5ebf41
|
paired_voice_audio_dataset - aligned codes support
|
2022-01-15 17:38:26 -07:00 |
|