James Betker
3580c52eac
Fix up wavfile_dataset to be able to provide a full clip
2021-08-15 20:53:26 -06:00
James Betker
a523c4f932
Auto-normalize wav files by data type
2021-08-15 09:09:51 -06:00
James Betker
c28f657ab8
Allow usage of pre-rendered mels saved to npy files
2021-08-14 23:38:15 -06:00
James Betker
d6a73acaed
Allow processing of multiple audio sources at once from nv_tacotron_dataset
2021-08-14 16:04:05 -06:00
James Betker
007976082b
GPT_asr for inference
2021-08-14 14:37:17 -06:00
James Betker
81e91c99de
Misc
2021-08-13 13:58:59 -06:00
James Betker
d0c74278bf
Enable multiple wavfile paths to be specified, fix eps bug in mp3 splitter
2021-08-11 08:46:02 -06:00
James Betker
e19c00398e
More improvements to random_mp3_splitter
2021-08-09 21:31:12 -06:00
James Betker
4100469902
Add a tool to split mp3 files into arbitrary chunks of wav files
2021-08-08 23:23:13 -06:00
James Betker
690d7e86d3
Fix nv_tacotron_dataset bug which incorrectly mapped filenames
...
dammit..
2021-08-08 11:38:52 -06:00
James Betker
a2afb25e42
Fix inference, always flow full text tokens through transformer
2021-08-07 20:11:10 -06:00
James Betker
a7496b661c
combined dvae ftw
2021-08-06 22:01:06 -06:00
James Betker
b43683b772
Add lucidrains_dvae
2021-08-06 12:03:46 -06:00
James Betker
62c7570512
Constrain wav_aug a bit more
2021-08-06 08:19:38 -06:00
James Betker
f86df53ce0
Export extract_byol_model as a function
2021-08-05 22:15:26 -06:00
James Betker
d120e1aa99
Add audio augmentation to wavfile_dataset, utility to test audio similary
2021-08-05 22:14:49 -06:00
James Betker
c0f61a2e15
Rework how DVAE tokens are ordered
...
It might make more sense to have top tokens, then bottom tokens
with top tokens having different discretized values.
2021-08-05 07:07:17 -06:00
James Betker
36c7c1fbdb
Fix training flow for NEXT TOKEN prediction instead of same token prediction
...
doh
2021-08-04 10:28:09 -06:00
James Betker
d9936df363
Add gpt_tts dataset and implement inference
...
- Adds a script which preprocesses quantized mels given a DVAE
- Adds a dataset which can consume preprocessed qmels
- Reworks GPT TTS to consume the outputs of that dataset (removes logic to add padding and start/end tokens)
- Adds inference to gpt_tts
2021-08-04 00:44:04 -06:00
James Betker
4c98b9703f
Get dalle-style TTS to "work"
2021-08-03 21:08:27 -06:00
James Betker
0c9e75bc69
Improvements to GptTts
2021-07-31 15:57:57 -06:00
James Betker
31ee9ae262
Checkin
2021-07-30 23:07:35 -06:00
James Betker
2325e7a88c
Allow inference for vqvae
2021-07-20 10:40:05 -06:00
James Betker
be2745f42d
Add waveglow & inference capabilities to audio generator
2021-07-08 23:07:36 -06:00
James Betker
3801d5d55e
diffusion surfin'
2021-07-06 09:36:52 -06:00
James Betker
a57ed8e960
Various mods to support better jpeg image filtering
2021-06-25 13:16:15 -06:00
James Betker
e7890dc0ba
Misc fixes for diffusion nets
2021-06-21 10:38:07 -06:00
James Betker
68cbbed886
Add some cool diffusion testing scripts
2021-06-16 16:26:36 -06:00
James Betker
65c474eecf
Various changes to fix testing
2021-06-11 15:31:10 -06:00
James Betker
44b09e5f20
Amplify dropout rate
2021-06-07 15:20:53 -06:00
James Betker
eda796985b
Try out dropout norm
2021-06-07 11:33:33 -06:00
James Betker
fb405d9ef1
CIFAR stuff
...
- Extract coarse labels for the CIFAR dataset
- Add simple resnet that branches lower layers based on coarse labels
- Some other cleanup
2021-06-05 14:16:02 -06:00
James Betker
45bc76ba92
Fixes and mods to support training classifiers on imagenet
2021-06-01 17:25:24 -06:00
James Betker
f129eaa39e
Clean up byol a bit
...
- Remove option to aug in dataset (there's really no reason for this now that kornia works on GPU on windows)
- Other stufff
2021-05-24 21:35:46 -06:00
James Betker
119f17c808
Add testing capabilities for segformer & contrastive feature
2021-04-27 09:59:50 -06:00
James Betker
23e01314d4
Add dataset, ui for labeling and evaluator for pointwise classification
2021-04-23 17:17:13 -06:00
James Betker
17555e7d07
misc adjustments for stylegan
2021-04-21 18:14:17 -06:00
James Betker
b687ef4cd0
Misc
2021-04-21 18:09:46 -06:00
James Betker
94e069bced
Misc changes
2021-03-13 10:45:26 -07:00
James Betker
543d459b4e
extract_temporal_squares script
...
For extracting related patches across a video
2021-02-08 08:10:24 -07:00
James Betker
784b96c059
Misc options to add support for training stylegan2-rosinality models:
...
- Allow image_folder_dataset to normalize inbound images
- ExtensibleTrainer can denormalize images on the output path
- Support .webp - an output from LSUN
- Support logistic GAN divergence loss
- Support stylegan2 TF weight extraction for discriminator
- New injector that produces latent noise (with separated paths)
- Modify FID evaluator to be operable with rosinality-style GANs
2021-02-08 08:09:21 -07:00
James Betker
0dca36946f
Hard Routing mods
...
- Turns out my custom convolution was RIDDLED with backwards bugs, which is
why the existing implementation wasn't working so well.
- Implements the switch logic from both Mixture of Experts and Switch Transformers
for testing purposes.
2021-02-02 20:35:58 -07:00
James Betker
dac7d768fa
test uresnet playground mods
2021-01-23 13:46:43 -07:00
James Betker
557cdec116
misc
2021-01-23 13:45:17 -07:00
James Betker
d1007ccfe7
Adjustments to pixpro to allow training against networks with arbitrarily large structural latents
...
- The pixpro latent now rescales the latent space instead of using a "coordinate vector", which
**might** have performance implications.
- The latent against which the pixel loss is computed can now be a small, randomly sampled patch
out of the entire latent, allowing further memory/computational discounts. Since the loss
computation does not have a receptive field, this should not alter the loss.
- The instance projection size can now be separate from the pixel projection size.
- PixContrast removed entirely.
- ResUnet with full resolution added.
2021-01-12 09:17:45 -07:00
James Betker
34f8c8641f
Support training imagenet classifier
2021-01-11 20:09:16 -07:00
James Betker
14a868e8e6
byol playground updates
2021-01-09 20:54:21 -07:00
James Betker
41b7d50944
Update extract_square_images
2021-01-08 13:16:34 -07:00
James Betker
5a8156026a
Did anyone ask for k-means clustering?
...
This is so cool...
2021-01-07 22:37:41 -07:00
James Betker
659814c20f
BYOL script updates
2021-01-07 16:31:28 -07:00