DL-Art-School

Author	SHA1	Message	Date
James Betker	5d714bc566	Add deepspeech model and support for decoding with it	2021-10-27 13:09:46 -06:00
James Betker	21b6daa0ed	Introduce clip resampling	2021-10-26 10:42:23 -06:00
James Betker	c3421b7f6d	Dataset work for audio quality processor	2021-10-24 09:09:34 -06:00
James Betker	06ea6191a9	Initial implementation of audio_with_noise dataset	2021-10-21 16:45:19 -06:00
James Betker	d016a2fbad	Go back to vanilla flavor of diffusion	2021-10-17 17:32:46 -06:00
James Betker	6833048bf7	Alterations to diffusion_dvae so it can be used directly on spectrograms	2021-09-23 15:56:25 -06:00
James Betker	359e9e27a7	unsupervised_audio_dataset: try to recover from failures of audio2numpy	2021-09-17 15:25:57 -06:00
James Betker	f78ce9d924	Get diffusion_dvae ready for prime time!	2021-09-16 22:43:10 -06:00
James Betker	1197ae1928	Misc	2021-09-16 10:53:56 -06:00
James Betker	8d9857f33d	More fixes	2021-09-14 20:45:05 -06:00
James Betker	9a9c90660f	Fixes	2021-09-14 18:29:17 -06:00
James Betker	e513052fca	Add unsupervised_audio_dataset	2021-09-14 17:43:16 -06:00
James Betker	b8f2e0f452	mydvae	2021-09-06 17:45:30 -06:00
James Betker	30cd33fe44	another fix	2021-08-31 14:46:46 -06:00
James Betker	8810d3de97	fix wavfile_dataset	2021-08-31 14:45:29 -06:00
James Betker	dabd87246d	Add unet_diffusion_vocoder	2021-08-31 14:38:33 -06:00
James Betker	570ed327ed	Stop dataset - attempt #2	2021-08-18 18:29:38 -06:00
James Betker	8332923f5c	Two more tools to test the audio segmentor	2021-08-17 09:09:11 -06:00
James Betker	93e903af15	Rework wavfile dataset to be usable for things other than augments	2021-08-16 22:52:35 -06:00
James Betker	d7f30232c3	Oh yeah	2021-08-16 22:52:15 -06:00
James Betker	4c01d82265	Fix for voxpopuli	2021-08-16 22:52:05 -06:00
James Betker	1fede41b7b	Audio segmentor	2021-08-16 22:51:53 -06:00
James Betker	2d3372054d	Add support for voxpopuli to nv_tacotron_dataset	2021-08-16 17:13:40 -06:00
James Betker	3580c52eac	Fix up wavfile_dataset to be able to provide a full clip	2021-08-15 20:53:26 -06:00
James Betker	a523c4f932	Auto-normalize wav files by data type	2021-08-15 09:09:51 -06:00
James Betker	c28f657ab8	Allow usage of pre-rendered mels saved to npy files	2021-08-14 23:38:15 -06:00
James Betker	ad3391bd96	Fix nan issue when interpolating audio	2021-08-14 20:42:01 -06:00
James Betker	769f0acc53	Moar fix	2021-08-14 17:23:15 -06:00
James Betker	3d2e724083	Fix audio ranging problem	2021-08-14 17:18:55 -06:00
James Betker	d6a73acaed	Allow processing of multiple audio sources at once from nv_tacotron_dataset	2021-08-14 16:04:05 -06:00
James Betker	007976082b	GPT_asr for inference	2021-08-14 14:37:17 -06:00
James Betker	72622b4d61	Allow saving mel strips as files from the dataset implementation	2021-08-13 22:46:41 -06:00
James Betker	cfd284f425	Fix up some stuff that allows the MEL to be computed on-GPU	2021-08-13 18:35:55 -06:00
James Betker	fff1a59e08	max/min mel invalid fix	2021-08-13 09:36:31 -06:00
James Betker	4b2946e581	More fix	2021-08-12 15:51:23 -06:00
James Betker	4c76257c71	Dont require collation for nv_tacotron	2021-08-12 15:44:55 -06:00
James Betker	5b07d3b623	Found error that I was trying to fix with reload=True	2021-08-12 15:22:34 -06:00
James Betker	430b650a34	......	2021-08-12 10:31:10 -06:00
James Betker	b35d6ae028	Print some metrics from tacotron dataset when it croaks	2021-08-12 09:21:12 -06:00
James Betker	0c4d6b1916	Just offer generic re-load for nv-tacotron	2021-08-12 09:09:12 -06:00
James Betker	154f5aa73c	Fix annoying warning and add to requirements	2021-08-11 17:32:06 -06:00
James Betker	f04a7bdf63	Bug fixes for tacotron dataset on mozilla cv - Support a max mel length (mozilla cv has some tracks that are basically unbounded..) - Don't fail on low sample rates (mozilla cv has some of those)	2021-08-11 16:17:03 -06:00
James Betker	2d3f0cc33c	nv_tacotron_dataset - Allow training on mozilla cv	2021-08-11 13:34:31 -06:00
James Betker	d0c74278bf	Enable multiple wavfile paths to be specified, fix eps bug in mp3 splitter	2021-08-11 08:46:02 -06:00
James Betker	e19c00398e	More improvements to random_mp3_splitter	2021-08-09 21:31:12 -06:00
James Betker	74342b860b	Revert "Undo forced text padding" This reverts commit `83ab5e6a00`.	2021-08-09 11:56:34 -06:00
James Betker	d4e33bf15f	Fixes to the mp3 splitter	2021-08-09 11:55:46 -06:00
James Betker	4100469902	Add a tool to split mp3 files into arbitrary chunks of wav files	2021-08-08 23:23:13 -06:00
James Betker	83ab5e6a00	Undo forced text padding	2021-08-08 11:42:20 -06:00
James Betker	690d7e86d3	Fix nv_tacotron_dataset bug which incorrectly mapped filenames dammit..	2021-08-08 11:38:52 -06:00
James Betker	a2afb25e42	Fix inference, always flow full text tokens through transformer	2021-08-07 20:11:10 -06:00
James Betker	b43683b772	Add lucidrains_dvae	2021-08-06 12:03:46 -06:00
James Betker	62c7570512	Constrain wav_aug a bit more	2021-08-06 08:19:38 -06:00
James Betker	f126040da2	Undo noise first	2021-08-05 23:24:38 -06:00
James Betker	908ef5495f	Add noise first to audio_aug	2021-08-05 23:22:44 -06:00
James Betker	d6007c6de1	dataset fixes	2021-08-05 23:12:59 -06:00
James Betker	d120e1aa99	Add audio augmentation to wavfile_dataset, utility to test audio similary	2021-08-05 22:14:49 -06:00
James Betker	4017236ba9	Fix up inference for gpt_tts	2021-08-05 06:46:30 -06:00
James Betker	5037220ac7	Mods to support contrastive learning on audio files	2021-08-05 05:57:04 -06:00
James Betker	341f28dd82	It works!	2021-08-04 20:07:51 -06:00
James Betker	d9936df363	Add gpt_tts dataset and implement inference - Adds a script which preprocesses quantized mels given a DVAE - Adds a dataset which can consume preprocessed qmels - Reworks GPT TTS to consume the outputs of that dataset (removes logic to add padding and start/end tokens) - Adds inference to gpt_tts	2021-08-04 00:44:04 -06:00
James Betker	dadc54795c	Add gpt_tts	2021-07-27 20:33:30 -06:00
James Betker	49e3b310ea	Allow audio sample rate interpolation for faster training	2021-07-26 17:44:06 -06:00
James Betker	96e90e7047	Add support for a gaussian-diffusion-based wave tacotron	2021-07-26 16:27:31 -06:00
James Betker	d81386c1be	Mods to support vqvae in audio mode (1d)	2021-07-20 08:36:46 -06:00
James Betker	1ff434218e	tacotron2, ready for prime time!	2021-07-08 22:13:44 -06:00
James Betker	86fd3ad7fd	Initial checkin of nvidia tacotron model & dataset These two are tested, full support for training to come.	2021-07-06 11:11:35 -06:00
James Betker	afa41f1804	Allow hq color jittering and corruptions that are not included in the corruption factor	2021-06-30 09:44:46 -06:00
James Betker	6fd16ea9c8	Add meta-anomaly detection, colorjitter augmentation	2021-06-29 13:41:55 -06:00
James Betker	46e9f62be0	Add unet with latent guide This is a diffusion network that uses both a LQ image and a reference sample HQ image that is compressed into a latent vector to perform upsampling The hope is that we can steer the upsampling network with sample images.	2021-06-26 11:02:58 -06:00
James Betker	0ded106562	Merge remote-tracking branch 'origin/master'	2021-06-25 13:16:28 -06:00
James Betker	a57ed8e960	Various mods to support better jpeg image filtering	2021-06-25 13:16:15 -06:00
James Betker	61e7ca39cd	Update image_folder_dataset.py	2021-06-25 11:48:31 -06:00
James Betker	6b32c87dcb	Try to make diffusion fid more deterministic	2021-06-14 09:27:43 -06:00
James Betker	65c474eecf	Various changes to fix testing	2021-06-11 15:31:10 -06:00
James Betker	6c6e82406e	Pass a corruption factor through the dataset into the upsampling network The intuition is this will help guide the network to make better informed decisions about how it performs upsampling based on how it perceives the underlying content. (I'm giving up on letting networks detect their own quality - I'm not convinced it is actually feasible)	2021-06-07 09:13:54 -06:00
James Betker	fb405d9ef1	CIFAR stuff - Extract coarse labels for the CIFAR dataset - Add simple resnet that branches lower layers based on coarse labels - Some other cleanup	2021-06-05 14:16:02 -06:00
James Betker	e6c537824a	Allow validation for ce	2021-06-04 21:21:04 -06:00
James Betker	7c251af7a8	Support cifar100 with resnet	2021-06-04 17:29:07 -06:00
James Betker	6084915af8	Support gaussian diffusion models Adds support for GD models, courtesy of some maths from openai. Also: - Fixes requirement for eval{} even when it isn't being used - Adds support for denormalizing an imagenet norm	2021-06-02 21:47:32 -06:00
James Betker	45bc76ba92	Fixes and mods to support training classifiers on imagenet	2021-06-01 17:25:24 -06:00
James Betker	6649ef2dae	Add zipfilesdataset	2021-05-24 21:35:00 -06:00
James Betker	9bbe6fc81e	Get segformer to a trainable state	2021-04-25 11:45:20 -06:00
James Betker	23e01314d4	Add dataset, ui for labeling and evaluator for pointwise classification	2021-04-23 17:17:13 -06:00
James Betker	b687ef4cd0	Misc	2021-04-21 18:09:46 -06:00
James Betker	f89ea5f1c6	Mods to support lightweight_gan model	2021-03-02 20:51:48 -07:00
James Betker	784b96c059	Misc options to add support for training stylegan2-rosinality models: - Allow image_folder_dataset to normalize inbound images - ExtensibleTrainer can denormalize images on the output path - Support .webp - an output from LSUN - Support logistic GAN divergence loss - Support stylegan2 TF weight extraction for discriminator - New injector that produces latent noise (with separated paths) - Modify FID evaluator to be operable with rosinality-style GANs	2021-02-08 08:09:21 -07:00
James Betker	34f8c8641f	Support training imagenet classifier	2021-01-11 20:09:16 -07:00
James Betker	4119cd6240	Fix to image_folder_dataset to accomodate images with mismatched dimensions	2021-01-10 12:57:21 -07:00
James Betker	5e7ade0114	ImageFolderDataset - corrupt lq images alongside each other	2021-01-03 16:36:38 -07:00
James Betker	193cdc6636	Move discriminators to the create_model paradigm Also cleans up a lot of old discriminator models that I have no intention of using again.	2021-01-01 15:56:09 -07:00
James Betker	1de1fa30ac	Disable refs and centers altogether in single_image_dataset I suspect that this might be a cause of failures on parallel datasets. Plus it is unnecessary computation.	2020-12-31 10:13:24 -07:00
James Betker	ba543d1152	Glean mods - Fixes fixed upscale factor issues - Refines a few ops to decrease computation & parameterization	2020-12-27 12:25:06 -07:00
James Betker	2706a84f15	Merge remote-tracking branch 'origin/gan_lab' into gan_lab	2020-12-26 13:50:34 -07:00
James Betker	90e2362c00	Fix bug with full_image_dataset	2020-12-26 13:50:27 -07:00
James Betker	3fd627fc62	Mods to support image classification & filtering	2020-12-26 13:49:27 -07:00
James Betker	1bbcb96ee8	Implement a few changes to support training BYOL networks	2020-12-23 10:50:23 -07:00
James Betker	e7aeb17404	ImageFolder dataset: allow intermediary downscale before corrupt For massive upscales (ex: 8x), corruption does almost nothing when applied at the HQ level. This patch adds support to perform corruption at a specified intermediary scale. The dataset downscales to this level, performs the corruption, then downscales the rest of the way to get the LQ image.	2020-12-22 15:42:21 -07:00
James Betker	7938f9f50b	Fix bug with single_image_dataset which prevented working on multiple directories from working	2020-12-19 15:13:46 -07:00
James Betker	d875ca8342	More refactor changes	2020-12-18 09:24:31 -07:00

287 Commits