DL-Art-School

Author	SHA1	Message	Date
James Betker	ac57cdc794	Add scheduling to quantizer, enable cudnn_benchmarking to be disabled	2021-09-24 17:01:36 -06:00
James Betker	3e64e847c2	Gumbel quantizer	2021-09-23 23:32:03 -06:00
James Betker	c5297ccec6	Add dvae balancing heuristic	2021-09-23 21:19:36 -06:00
James Betker	e24c619387	Fix	2021-09-23 16:07:58 -06:00
James Betker	6833048bf7	Alterations to diffusion_dvae so it can be used directly on spectrograms	2021-09-23 15:56:25 -06:00
James Betker	5c8d266d4f	chk	2021-09-17 09:15:36 -06:00
James Betker	a6544f1684	More checkpointing fixes	2021-09-16 23:12:43 -06:00
James Betker	94899d88f3	Fix overuse of checkpointing	2021-09-16 23:00:28 -06:00
James Betker	f78ce9d924	Get diffusion_dvae ready for prime time!	2021-09-16 22:43:10 -06:00
James Betker	6f48674647	Support diffusion models with extra return values & inference in diffusion_dvae	2021-09-16 10:53:46 -06:00
James Betker	0382660159	Get diffusion_dvae functional	2021-09-14 17:43:31 -06:00
James Betker	76e2c497f7	Improvements to splitter	2021-09-09 23:34:56 -06:00
James Betker	742f9b4010	Batch spleeter cleaner using GPU	2021-09-09 23:14:32 -06:00
James Betker	73b930c0f6	Add diffusion_dvae Increase split_on_silence interval	2021-09-09 16:22:05 -06:00
James Betker	b8f2e0f452	mydvae	2021-09-06 17:45:30 -06:00
James Betker	3e073cff85	Set kernel_size in diffusion_vocoder	2021-09-01 08:33:46 -06:00
James Betker	dabd87246d	Add unet_diffusion_vocoder	2021-08-31 14:38:33 -06:00
James Betker	909754cc27	Add find_faulty_files.py	2021-08-25 18:00:43 -06:00
James Betker	08b33c8e3a	Support silu activation	2021-08-25 09:03:14 -06:00
James Betker	67bf7f5219	dvae mods Trying to squeeze as much performance out of this net as possible	2021-08-25 08:55:13 -06:00
James Betker	b521d94b01	Make gpt-asr more configurable	2021-08-19 16:33:41 -06:00
James Betker	570ed327ed	Stop dataset - attempt #2	2021-08-18 18:29:38 -06:00
James Betker	17453ccbe8	Revert mods to lrdvae They didn't really change anything	2021-08-17 09:09:29 -06:00
James Betker	8332923f5c	Two more tools to test the audio segmentor	2021-08-17 09:09:11 -06:00
James Betker	1fede41b7b	Audio segmentor	2021-08-16 22:51:53 -06:00
James Betker	729c1fd5a9	Fix up max lengths to save memory	2021-08-15 21:29:28 -06:00
James Betker	9e47e64d5a	Add gpt_segmentor model The idea is to specifically train a model that extracts phrases from audio clips.	2021-08-15 21:23:07 -06:00
James Betker	a826d5f658	Mods to dvae - Add resblock to each layer - Increase filter size for each layer - Use SiLU	2021-08-15 20:54:10 -06:00
James Betker	b8bec22f1a	Fix gpt_asr inference bug	2021-08-15 20:53:42 -06:00
James Betker	a523c4f932	Auto-normalize wav files by data type	2021-08-15 09:09:51 -06:00
James Betker	98057b6516	Make lrdvae use quantized mode in eval()	2021-08-14 23:43:01 -06:00
James Betker	ad3391bd96	Fix nan issue when interpolating audio	2021-08-14 20:42:01 -06:00
James Betker	d6a73acaed	Allow processing of multiple audio sources at once from nv_tacotron_dataset	2021-08-14 16:04:05 -06:00
James Betker	007976082b	GPT_asr for inference	2021-08-14 14:37:17 -06:00
James Betker	e1bdd3f7c7	Fix gpt_asr bug. Initial implementation of beam search	2021-08-13 22:47:00 -06:00
James Betker	cdee31c60b	GPT_ASR	2021-08-13 15:02:18 -06:00
James Betker	f5a9b88ef6	tacotron cleaners: remove quotation marks these don't really have relevance for tts or asr	2021-08-11 16:18:44 -06:00
James Betker	20586a8edc	Fix LRDVAE bug with quantizer integration	2021-08-11 16:17:22 -06:00
James Betker	82fc69abfa	Add "pure" evaluator Which simply computes the training loss against an eval dataset	2021-08-09 14:58:35 -06:00
James Betker	080bea2f19	No, really	2021-08-09 12:02:31 -06:00
James Betker	e1ce4671e4	Apply dropout to gpt_tts, get rid of min_gpt implementation	2021-08-09 12:01:10 -06:00
James Betker	1068f53b78	Add a sampling beam search	2021-08-09 11:56:06 -06:00
James Betker	01cfae28d8	Beam search implementation in one pass? Dayyyum	2021-08-08 23:22:42 -06:00
James Betker	690d7e86d3	Fix nv_tacotron_dataset bug which incorrectly mapped filenames dammit..	2021-08-08 11:38:52 -06:00
James Betker	a2afb25e42	Fix inference, always flow full text tokens through transformer	2021-08-07 20:11:10 -06:00
James Betker	4c678172d6	ugh	2021-08-06 22:10:18 -06:00
James Betker	e723137273	Make gpttts more configurable	2021-08-06 22:08:51 -06:00
James Betker	a7496b661c	combined dvae ftw	2021-08-06 22:01:06 -06:00
James Betker	0237e96b34	Fix dvae bug	2021-08-06 14:17:01 -06:00
James Betker	0799d95af5	Use quantizer from rosinality/vqvae with openai dvae	2021-08-06 14:06:26 -06:00
James Betker	d3ace153af	Add logic for performing inference using gpt_tts with dual-encoder modes	2021-08-06 12:04:12 -06:00
James Betker	b43683b772	Add lucidrains_dvae	2021-08-06 12:03:46 -06:00
James Betker	70dcd1107f	Fix byol_model_wrapper to function with audio inputs	2021-08-05 22:20:22 -06:00
James Betker	89d15c9e74	Move gpt-tts back to lucidrains implementation Much better performance.	2021-08-05 22:15:13 -06:00
James Betker	d120e1aa99	Add audio augmentation to wavfile_dataset, utility to test audio similarity	2021-08-05 22:14:49 -06:00
James Betker	c0f61a2e15	Rework how DVAE tokens are ordered It might make more sense to have top tokens, then bottom tokens with top tokens having different discretized values.	2021-08-05 07:07:17 -06:00
James Betker	4017236ba9	Fix up inference for gpt_tts	2021-08-05 06:46:30 -06:00
James Betker	5037220ac7	Mods to support contrastive learning on audio files	2021-08-05 05:57:04 -06:00
James Betker	341f28dd82	It works!	2021-08-04 20:07:51 -06:00
James Betker	36c7c1fbdb	Fix training flow for NEXT TOKEN prediction instead of same token prediction doh	2021-08-04 10:28:09 -06:00
James Betker	d9936df363	Add gpt_tts dataset and implement inference - Adds a script which preprocesses quantized mels given a DVAE - Adds a dataset which can consume preprocessed qmels - Reworks GPT TTS to consume the outputs of that dataset (removes logic to add padding and start/end tokens) - Adds inference to gpt_tts	2021-08-04 00:44:04 -06:00
James Betker	4c98b9703f	Get dalle-style TTS to "work"	2021-08-03 21:08:27 -06:00
James Betker	2814307eee	Alterations to support VQVAE on mel spectrograms	2021-08-01 07:54:21 -06:00
James Betker	0c9e75bc69	Improvements to GptTts	2021-07-31 15:57:57 -06:00
James Betker	31ee9ae262	Checkin	2021-07-30 23:07:35 -06:00
James Betker	dadc54795c	Add gpt_tts	2021-07-27 20:33:30 -06:00
James Betker	398185e109	More work on wave-diffusion	2021-07-27 05:36:17 -06:00
James Betker	49e3b310ea	Allow audio sample rate interpolation for faster training	2021-07-26 17:44:06 -06:00
James Betker	96e90e7047	Add support for a gaussian-diffusion-based wave tacotron	2021-07-26 16:27:31 -06:00
James Betker	97d7cbbc34	Additional work for audio xformer (which doesn't really do a great job)	2021-07-23 10:58:14 -06:00
James Betker	d81386c1be	Mods to support vqvae in audio mode (1d)	2021-07-20 08:36:46 -06:00
James Betker	5584cfcc7a	tacotron2 work	2021-07-14 21:41:57 -06:00
James Betker	fe0c699ced	Various fixes	2021-07-14 00:08:42 -06:00
James Betker	be2745f42d	Add waveglow & inference capabilities to audio generator	2021-07-08 23:07:36 -06:00
James Betker	1ff434218e	tacotron2, ready for prime time!	2021-07-08 22:13:44 -06:00
James Betker	86fd3ad7fd	Initial checkin of nvidia tacotron model & dataset These two are tested, full support for training to come.	2021-07-06 11:11:35 -06:00
James Betker	afa41f1804	Allow hq color jittering and corruptions that are not included in the corruption factor	2021-06-30 09:44:46 -06:00
James Betker	6fd16ea9c8	Add meta-anomaly detection, colorjitter augmentation	2021-06-29 13:41:55 -06:00
James Betker	46e9f62be0	Add unet with latent guide This is a diffusion network that uses both a LQ image and a reference sample HQ image that is compressed into a latent vector to perform upsampling The hope is that we can steer the upsampling network with sample images.	2021-06-26 11:02:58 -06:00
James Betker	0ded106562	Merge remote-tracking branch 'origin/master'	2021-06-25 13:16:28 -06:00
James Betker	a57ed8e960	Various mods to support better jpeg image filtering	2021-06-25 13:16:15 -06:00
James Betker	a0ef07ddb8	Create unet_latent_guide.py	2021-06-25 11:25:14 -06:00
James Betker	e7890dc0ba	Misc fixes for diffusion nets	2021-06-21 10:38:07 -06:00
James Betker	65c474eecf	Various changes to fix testing	2021-06-11 15:31:10 -06:00
James Betker	220f11a5e4	Half channel sizes in cifar_resnet	2021-06-09 17:06:37 -06:00
James Betker	9b5f4abb91	Add fade in for hard switch	2021-06-07 18:15:09 -06:00
James Betker	108c5d829c	Fix dropout norm	2021-06-07 16:13:23 -06:00
James Betker	438217094c	Also debug distribution of switch	2021-06-07 15:36:07 -06:00
James Betker	44b09e5f20	Amplify dropout rate	2021-06-07 15:20:53 -06:00
James Betker	f0d4eb9182	Fixor	2021-06-07 11:58:36 -06:00
James Betker	c456a60466	Another go at fixing nan	2021-06-07 11:51:43 -06:00
James Betker	1c574c5bd1	Attempt to fix nan	2021-06-07 11:43:42 -06:00
James Betker	eda796985b	Try out dropout norm	2021-06-07 11:33:33 -06:00
James Betker	6c6e82406e	Pass a corruption factor through the dataset into the upsampling network The intuition is this will help guide the network to make better informed decisions about how it performs upsampling based on how it perceives the underlying content. (I'm giving up on letting networks detect their own quality - I'm not convinced it is actually feasible)	2021-06-07 09:13:54 -06:00
James Betker	061dbcd458	Another fix to anorm	2021-06-06 15:09:49 -06:00
James Betker	9a6991e461	Fix switch norm average	2021-06-06 15:04:28 -06:00
James Betker	57e1a6a0f2	cifar: add hard routing Also mods switched_routing to support non-pixular inputs	2021-06-06 14:53:43 -06:00
James Betker	692e9c417b	Support diffusion unet	2021-06-06 13:57:22 -06:00
James Betker	a0158ebc69	Simplify cifar resnet further for faster training	2021-06-06 10:02:24 -06:00
James Betker	75567a9814	Only head norm removed	2021-06-05 23:29:11 -06:00
James Betker	65d0376b90	Re-add normalization at the tail of the RRDB	2021-06-05 23:04:05 -06:00
James Betker	184e887122	Remove rrdb normalization	2021-06-05 21:39:19 -06:00
James Betker	f5e75602b9	Add regular attention to cifar_resnet	2021-06-05 21:34:07 -06:00
James Betker	af52751d6b	Fix device error	2021-06-05 14:21:32 -06:00
James Betker	5f0cc65f3b	Register branched resnet properly	2021-06-05 14:19:03 -06:00
James Betker	fb405d9ef1	CIFAR stuff - Extract coarse labels for the CIFAR dataset - Add simple resnet that branches lower layers based on coarse labels - Some other cleanup	2021-06-05 14:16:02 -06:00
James Betker	80d4404367	A few fixes: - Output better prediction of xstart from eps - Support LossAwareSampler - Support AdamW	2021-06-05 13:40:32 -06:00
James Betker	7c251af7a8	Support cifar100 with resnet	2021-06-04 17:29:07 -06:00
James Betker	bf811f80c1	GD mods & fixes - Report variational loss separately - Report model prediction from injector - Log these things - Use respacing like guided diffusion	2021-06-04 17:13:16 -06:00
James Betker	6084915af8	Support gaussian diffusion models Adds support for GD models, courtesy of some maths from openai. Also: - Fixes requirement for eval{} even when it isn't being used - Adds support for denormalizing an imagenet norm	2021-06-02 21:47:32 -06:00
James Betker	f129eaa39e	Clean up byol a bit - Remove option to aug in dataset (there's really no reason for this now that kornia works on GPU on windows) - Other stuff	2021-05-24 21:35:46 -06:00
James Betker	1a2b9fa130	Get rid of old byol net wrapping Simplifies and makes this usable with DLAS' multi-gpu trainer	2021-04-27 12:48:34 -06:00
James Betker	119f17c808	Add testing capabilities for segformer & contrastive feature	2021-04-27 09:59:50 -06:00
James Betker	9bbe6fc81e	Get segformer to a trainable state	2021-04-25 11:45:20 -06:00
James Betker	fc623d4b5a	Add segformer model. Start work on BYOL adaptation that will support training it.	2021-04-23 17:16:46 -06:00
James Betker	17555e7d07	misc adjustments for stylegan	2021-04-21 18:14:17 -06:00
James Betker	b687ef4cd0	Misc	2021-04-21 18:09:46 -06:00
James Betker	9fc3df3f5b	Switched conv: add conversion function with allowlist	2021-03-13 10:44:56 -07:00
James Betker	cf9a6da889	Fix some bugs, checkin work on vqvae3	2021-03-02 20:56:19 -07:00
James Betker	f89ea5f1c6	Mods to support lightweight_gan model	2021-03-02 20:51:48 -07:00
James Betker	39fd755baa	New benchmark numbers	2021-02-08 08:09:41 -07:00
James Betker	784b96c059	Misc options to add support for training stylegan2-rosinality models: - Allow image_folder_dataset to normalize inbound images - ExtensibleTrainer can denormalize images on the output path - Support .webp - an output from LSUN - Support logistic GAN divergence loss - Support stylegan2 TF weight extraction for discriminator - New injector that produces latent noise (with separated paths) - Modify FID evaluator to be operable with rosinality-style GANs	2021-02-08 08:09:21 -07:00
James Betker	e7be4bdff3	Revert	2021-02-05 08:43:07 -07:00
James Betker	6dec1f5968	Back to groupnorm	2021-02-05 08:42:11 -07:00
James Betker	336f807c8e	lambda2	2021-02-05 00:00:24 -07:00
James Betker	025a5867c4	Use syncbatchnorm instead	2021-02-04 22:26:36 -07:00
James Betker	bb79fafb89	Fix groupnorm specification	2021-02-04 22:15:38 -07:00
James Betker	43da1f9c4b	Convert lambda coupler to use groupnorm instead of batchnorm	2021-02-04 21:59:44 -07:00
James Betker	7070142805	Make vqvae3_hard more configurable	2021-02-04 09:03:22 -07:00
James Betker	b980028ca8	Add get_debug_values for vqvae_3_hardswitch	2021-02-03 14:12:24 -07:00
James Betker	1405ff06b8	Fix SwitchedConvHardRoutingFunction for current cuda router	2021-02-03 14:11:55 -07:00
James Betker	d7bec392dd	...	2021-02-02 23:50:25 -07:00
James Betker	b0a8fa00bc	Visual dbg in vqvae3hs	2021-02-02 23:50:01 -07:00
James Betker	f5f91850fd	hardswitch variant of vqvae3	2021-02-02 21:00:04 -07:00
James Betker	320edbaa3c	Move switched_conv logic around a bit	2021-02-02 20:41:24 -07:00
James Betker	0dca36946f	Hard Routing mods - Turns out my custom convolution was RIDDLED with backwards bugs, which is why the existing implementation wasn't working so well. - Implements the switch logic from both Mixture of Experts and Switch Transformers for testing purposes.	2021-02-02 20:35:58 -07:00
James Betker	29c1c3bede	Register vqvae3	2021-01-29 15:26:28 -07:00
James Betker	bc20b4739e	vqvae3 Changes VQVAE as so: - Reverts back to smaller codebook - Adds an additional conv layer at the highest resolution for both the encoder & decoder - Uses LeakyReLU on trunk	2021-01-29 15:24:26 -07:00
James Betker	96bc80313c	Add switch norm, up dropout rate, detach selector	2021-01-26 09:31:53 -07:00
James Betker	2cdac6bd09	Add PWCNet for human optical flow	2021-01-25 08:25:44 -07:00
James Betker	51b63b2aa6	Add switched_conv with hard routing and make vqvae use it.	2021-01-25 08:25:29 -07:00
James Betker	ae4ff4a1e7	Enable lambda visualization	2021-01-23 15:53:27 -07:00
James Betker	10ec6bda1d	lambda nets in switched_conv and a vqvae to use it	2021-01-23 14:57:57 -07:00
James Betker	b374dcdd46	update vqvae to double codebook size for bottom quantizer	2021-01-23 13:47:07 -07:00
James Betker	1b8a26db93	New switched_conv	2021-01-23 13:46:30 -07:00
James Betker	d919ae7148	Add VQVAE with no Conv2dTranspose	2021-01-18 08:49:59 -07:00
James Betker	587a4f4050	resnet_unet_3 I'm being really lazy here - these nets are not really different from each other except at which layer they terminate. This one terminates at 2x downsampling, which is simply indicative of a direction I want to go for testing these pixpro networks.	2021-01-15 14:51:03 -07:00
James Betker	038b8654b6	Pixpro: unwrap losses	2021-01-13 11:54:25 -07:00
James Betker	8990801a3f	Fix pixpro stochastic sampling bugs	2021-01-13 11:34:24 -07:00
James Betker	19475a072f	Pixpro: Rather than using a latent square for pixpro, use an entirely stochastic sampling of the pixels	2021-01-13 11:26:51 -07:00

917 Commits