DL-Art-School

Author	SHA1	Message	Date
James Betker	53858b2055	Fix gpt_tts_hf inference	2021-12-20 17:45:26 -07:00
James Betker	b4ddcd7111	More inference improvements	2021-12-19 09:01:19 -07:00
James Betker	f9c45d70f0	Fix mel terminator	2021-12-18 17:18:06 -07:00
James Betker	937045cb63	Fixes	2021-12-18 16:45:38 -07:00
James Betker	dee34f096c	Add use_gpt_tts script	2021-12-16 23:28:54 -07:00
James Betker	62c8ed9a29	move speech utils	2021-12-16 20:47:37 -07:00
James Betker	aa7cfd1edf	Add support for mel norms across the channel dim	2021-12-12 19:52:08 -07:00
James Betker	63bf135b93	Support norms	2021-12-11 08:30:49 -07:00
James Betker	959979086d	fix	2021-12-11 08:18:00 -07:00
James Betker	5a664aa56e	misc	2021-12-11 08:17:26 -07:00
James Betker	d610540ce5	mel norm computation script	2021-12-11 08:16:50 -07:00
James Betker	b2d8fbcfc0	build a better speech synthesis toolset	2021-12-09 22:59:56 -07:00
James Betker	32cfcf3684	Turn off optimization in find_faulty_files	2021-12-09 09:02:09 -07:00
James Betker	a66a2bf91b	Update find_faulty_files	2021-12-09 09:00:00 -07:00
James Betker	04454ee63a	Add evaluation logic for gpt_asr_hf2	2021-12-02 21:04:36 -07:00
James Betker	82d0e7720e	Add choke to lucidrains_dvae	2021-11-23 18:53:37 -07:00
James Betker	973f47c525	misc nonfunctional	2021-11-22 17:16:39 -07:00
James Betker	3125ca38f5	Further wandb logs	2021-11-22 16:40:19 -07:00
James Betker	687e0746b3	Add Torch-derived MelSpectrogramInjector	2021-11-18 20:02:45 -07:00
James Betker	c30a38cdf1	Undo baseline GDI changes	2021-11-18 20:02:09 -07:00
James Betker	9b693b0a54	Fixes to filter_clips_hifreq	2021-11-07 18:42:22 -07:00
James Betker	a367ea3fda	Add script for computing attention for gpt_asr	2021-11-07 18:42:06 -07:00
James Betker	3c0f2fbb21	Add filtration script for finding resampled clips (or phone calls)	2021-11-07 14:16:11 -07:00
James Betker	756b4dad09	Working gpt_asr_hf inference - and it's a beast!	2021-11-06 21:47:15 -06:00
James Betker	596a62fe01	Apply fix to gpt_asr_hf and prep it for inference. Fix is that we were predicting two characters in advance, not next character	2021-11-04 10:09:24 -06:00
James Betker	36ed28913a	Fix two scripts	2021-10-30 17:00:06 -06:00
James Betker	466b9fbcaa	classify	2021-10-29 20:22:40 -06:00
James Betker	986fc9628d	Check in GPT with new inference methods (but not the backing code..)	2021-10-29 17:21:40 -06:00
James Betker	579f0a70ee	Move UnsupervisedAudioDataset to use my new mp3 loader	2021-10-28 22:33:12 -06:00
James Betker	bb0a0c8264	classify_into_folders script	2021-10-27 14:56:16 -06:00
James Betker	d91dcbd404	Make classifier inference script more open	2021-10-27 13:18:54 -06:00
James Betker	5d714bc566	Add deepspeech model and support for decoding with it	2021-10-27 13:09:46 -06:00
James Betker	15437b2fc3	WER script	2021-10-26 13:30:29 -06:00
James Betker	ba6e46c02a	Further simplify diffusion_vocoder and make noise_surfer work	2021-10-26 08:54:30 -06:00
James Betker	f2a31702b5	Clean stuff up, move more things into arch_util	2021-10-20 21:19:25 -06:00
James Betker	d016a2fbad	Go back to vanilla flavor of diffusion	2021-10-17 17:32:46 -06:00
James Betker	c861054218	Restore spleeter_splitter. The mods don't help - in TF mode, everything is done on the GPU anyways. Something else is going to have to be done to fix this.	2021-10-09 23:55:42 -06:00
James Betker	32ba496632	More fixes	2021-10-09 23:27:14 -06:00
James Betker	932ea29a83	Add multiprocessing to the spleeter splitter script to try and improve performance further	2021-10-09 23:15:36 -06:00
James Betker	b94e587f46	Improvements to spleeter_filter_noisy_clips	2021-10-07 21:28:00 -06:00
James Betker	bb891a3a53	Add partitioning and improved resuming to the spleeter filtering	2021-10-06 17:10:12 -06:00
James Betker	4914c526dc	More cleanup	2021-09-29 14:24:49 -06:00
James Betker	fc8ae4679a	Work on spleeter filtering script	2021-09-29 09:24:56 -06:00
James Betker	55b58fb67f	Clean up codebase. Remove stuff that I'm likely not going to use again (or generally failed experiments)	2021-09-29 09:21:44 -06:00
James Betker	ac57cdc794	Add scheduling to quantizer, enable cudnn_benchmarking to be disabled	2021-09-24 17:01:36 -06:00
James Betker	6833048bf7	Alterations to diffusion_dvae so it can be used directly on spectrograms	2021-09-23 15:56:25 -06:00
James Betker	97ea329a59	Make spleeter filter simpler (and hopefully much faster)	2021-09-17 15:29:42 -06:00
James Betker	f78ce9d924	Get diffusion_dvae ready for prime time!	2021-09-16 22:43:10 -06:00
James Betker	1197ae1928	Misc	2021-09-16 10:53:56 -06:00
James Betker	4334a67924	Spleeter mods	2021-09-14 17:43:40 -06:00
James Betker	bc603c3231	Script adjustments and fixes	2021-09-12 21:26:45 -06:00
James Betker	76e2c497f7	Improvements to splitter	2021-09-09 23:34:56 -06:00
James Betker	742f9b4010	Batch spleeter cleaner using GPU	2021-09-09 23:14:32 -06:00
James Betker	73b930c0f6	Add diffusion_dvae. Increase split_on_silence interval	2021-09-09 16:22:05 -06:00
James Betker	b8f2e0f452	mydvae	2021-09-06 17:45:30 -06:00
James Betker	92e7e57f81	Update diffusion_noise_surfer to support audio	2021-09-01 08:34:47 -06:00
James Betker	274d352e6f	dug	2021-08-30 21:45:58 -06:00
James Betker	f1a0c21fb2	asr_eval	2021-08-30 21:41:34 -06:00
James Betker	ed6eae407f	More scripts for splitting and formatting audio	2021-08-30 21:20:52 -06:00
James Betker	909754cc27	Add find_faulty_files.py	2021-08-25 18:00:43 -06:00
James Betker	d05cc1f46c	Misc	2021-08-24 17:12:04 -06:00
James Betker	b521d94b01	Make gpt-asr more configurable	2021-08-19 16:33:41 -06:00
James Betker	570ed327ed	Stop dataset - attempt #2	2021-08-18 18:29:38 -06:00
James Betker	8332923f5c	Two more tools to test the audio segmentor	2021-08-17 09:09:11 -06:00
James Betker	7c086d0c2c	libritts - only write on successful check	2021-08-16 22:52:55 -06:00
James Betker	1fede41b7b	Audio segmentor	2021-08-16 22:51:53 -06:00
James Betker	3580c52eac	Fix up wavfile_dataset to be able to provide a full clip	2021-08-15 20:53:26 -06:00
James Betker	a523c4f932	Auto-normalize wav files by data type	2021-08-15 09:09:51 -06:00
James Betker	c28f657ab8	Allow usage of pre-rendered mels saved to npy files	2021-08-14 23:38:15 -06:00
James Betker	d6a73acaed	Allow processing of multiple audio sources at once from nv_tacotron_dataset	2021-08-14 16:04:05 -06:00
James Betker	007976082b	GPT_asr for inference	2021-08-14 14:37:17 -06:00
James Betker	81e91c99de	Misc	2021-08-13 13:58:59 -06:00
James Betker	d0c74278bf	Enable multiple wavfile paths to be specified, fix eps bug in mp3 splitter	2021-08-11 08:46:02 -06:00
James Betker	e19c00398e	More improvements to random_mp3_splitter	2021-08-09 21:31:12 -06:00
James Betker	4100469902	Add a tool to split mp3 files into arbitrary chunks of wav files	2021-08-08 23:23:13 -06:00
James Betker	690d7e86d3	Fix nv_tacotron_dataset bug which incorrectly mapped filenames dammit..	2021-08-08 11:38:52 -06:00
James Betker	a2afb25e42	Fix inference, always flow full text tokens through transformer	2021-08-07 20:11:10 -06:00
James Betker	a7496b661c	combined dvae ftw	2021-08-06 22:01:06 -06:00
James Betker	b43683b772	Add lucidrains_dvae	2021-08-06 12:03:46 -06:00
James Betker	62c7570512	Constrain wav_aug a bit more	2021-08-06 08:19:38 -06:00
James Betker	f86df53ce0	Export extract_byol_model as a function	2021-08-05 22:15:26 -06:00
James Betker	d120e1aa99	Add audio augmentation to wavfile_dataset, utility to test audio similarity	2021-08-05 22:14:49 -06:00
James Betker	c0f61a2e15	Rework how DVAE tokens are ordered. It might make more sense to have top tokens, then bottom tokens with top tokens having different discretized values.	2021-08-05 07:07:17 -06:00
James Betker	36c7c1fbdb	Fix training flow for NEXT TOKEN prediction instead of same token prediction doh	2021-08-04 10:28:09 -06:00
James Betker	d9936df363	Add gpt_tts dataset and implement inference - Adds a script which preprocesses quantized mels given a DVAE - Adds a dataset which can consume preprocessed qmels - Reworks GPT TTS to consume the outputs of that dataset (removes logic to add padding and start/end tokens) - Adds inference to gpt_tts	2021-08-04 00:44:04 -06:00
James Betker	4c98b9703f	Get dalle-style TTS to "work"	2021-08-03 21:08:27 -06:00
James Betker	0c9e75bc69	Improvements to GptTts	2021-07-31 15:57:57 -06:00
James Betker	31ee9ae262	Checkin	2021-07-30 23:07:35 -06:00
James Betker	2325e7a88c	Allow inference for vqvae	2021-07-20 10:40:05 -06:00
James Betker	be2745f42d	Add waveglow & inference capabilities to audio generator	2021-07-08 23:07:36 -06:00
James Betker	3801d5d55e	diffusion surfin'	2021-07-06 09:36:52 -06:00
James Betker	a57ed8e960	Various mods to support better jpeg image filtering	2021-06-25 13:16:15 -06:00
James Betker	e7890dc0ba	Misc fixes for diffusion nets	2021-06-21 10:38:07 -06:00
James Betker	68cbbed886	Add some cool diffusion testing scripts	2021-06-16 16:26:36 -06:00
James Betker	65c474eecf	Various changes to fix testing	2021-06-11 15:31:10 -06:00
James Betker	44b09e5f20	Amplify dropout rate	2021-06-07 15:20:53 -06:00
James Betker	eda796985b	Try out dropout norm	2021-06-07 11:33:33 -06:00
James Betker	fb405d9ef1	CIFAR stuff - Extract coarse labels for the CIFAR dataset - Add simple resnet that branches lower layers based on coarse labels - Some other cleanup	2021-06-05 14:16:02 -06:00
James Betker	45bc76ba92	Fixes and mods to support training classifiers on imagenet	2021-06-01 17:25:24 -06:00
James Betker	f129eaa39e	Clean up byol a bit - Remove option to aug in dataset (there's really no reason for this now that kornia works on GPU on windows) - Other stuff	2021-05-24 21:35:46 -06:00
James Betker	119f17c808	Add testing capabilities for segformer & contrastive feature	2021-04-27 09:59:50 -06:00
James Betker	23e01314d4	Add dataset, ui for labeling and evaluator for pointwise classification	2021-04-23 17:17:13 -06:00
James Betker	17555e7d07	misc adjustments for stylegan	2021-04-21 18:14:17 -06:00
James Betker	b687ef4cd0	Misc	2021-04-21 18:09:46 -06:00
James Betker	94e069bced	Misc changes	2021-03-13 10:45:26 -07:00
James Betker	543d459b4e	extract_temporal_squares script. For extracting related patches across a video	2021-02-08 08:10:24 -07:00
James Betker	784b96c059	Misc options to add support for training stylegan2-rosinality models: - Allow image_folder_dataset to normalize inbound images - ExtensibleTrainer can denormalize images on the output path - Support .webp - an output from LSUN - Support logistic GAN divergence loss - Support stylegan2 TF weight extraction for discriminator - New injector that produces latent noise (with separated paths) - Modify FID evaluator to be operable with rosinality-style GANs	2021-02-08 08:09:21 -07:00
James Betker	0dca36946f	Hard Routing mods - Turns out my custom convolution was RIDDLED with backwards bugs, which is why the existing implementation wasn't working so well. - Implements the switch logic from both Mixture of Experts and Switch Transformers for testing purposes.	2021-02-02 20:35:58 -07:00
James Betker	dac7d768fa	test uresnet playground mods	2021-01-23 13:46:43 -07:00
James Betker	557cdec116	misc	2021-01-23 13:45:17 -07:00
James Betker	d1007ccfe7	Adjustments to pixpro to allow training against networks with arbitrarily large structural latents - The pixpro latent now rescales the latent space instead of using a "coordinate vector", which might have performance implications. - The latent against which the pixel loss is computed can now be a small, randomly sampled patch out of the entire latent, allowing further memory/computational discounts. Since the loss computation does not have a receptive field, this should not alter the loss. - The instance projection size can now be separate from the pixel projection size. - PixContrast removed entirely. - ResUnet with full resolution added.	2021-01-12 09:17:45 -07:00
James Betker	34f8c8641f	Support training imagenet classifier	2021-01-11 20:09:16 -07:00
James Betker	14a868e8e6	byol playground updates	2021-01-09 20:54:21 -07:00
James Betker	41b7d50944	Update extract_square_images	2021-01-08 13:16:34 -07:00
James Betker	5a8156026a	Did anyone ask for k-means clustering? This is so cool...	2021-01-07 22:37:41 -07:00
James Betker	659814c20f	BYOL script updates	2021-01-07 16:31:28 -07:00
James Betker	61a86a3c1e	VQVAE	2021-01-07 10:20:15 -07:00
James Betker	9680294430	Move byol scripts around	2021-01-06 14:52:17 -07:00
James Betker	9fed90393f	Add lucidrains pixpro trainer	2021-01-05 20:14:22 -07:00
James Betker	39a94c74b5	Allow BYOL resnet playground to produce a latent dict	2021-01-04 20:11:29 -07:00
James Betker	ade2732c82	Transfer learning for styleSR. This is a concept from "Lifelong Learning GAN", although I'm skeptical of its novelty - basically you scale and shift the weights for the generator and discriminator of a pretrained GAN to "shift" into new modalities, e.g. faces->birds or whatever. There are some interesting applications of this that I would like to try out.	2021-01-04 20:10:48 -07:00
James Betker	4d8064c32c	Modifications to allow partially trained stylegan discriminators to be used	2021-01-03 16:37:18 -07:00
James Betker	193cdc6636	Move discriminators to the create_model paradigm. Also cleans up a lot of old discriminator models that I have no intention of using again.	2021-01-01 15:56:09 -07:00
James Betker	aae65e6ed8	Mods to byol_resnet_playground for large batches	2021-01-01 11:59:54 -07:00
James Betker	8de5a02a48	byol_resnet_playground. Similar to the spinenet playground, but tinkers with resnet instead	2020-12-31 10:15:04 -07:00
James Betker	9dc3c8f0ff	Script updates	2020-12-29 20:24:41 -07:00
James Betker	3fd627fc62	Mods to support image classification & filtering	2020-12-26 13:49:27 -07:00
James Betker	1bbcb96ee8	Implement a few changes to support training BYOL networks	2020-12-23 10:50:23 -07:00
James Betker	2437b33e74	Fix srflow_latent_space_playground bug	2020-12-22 15:42:38 -07:00
James Betker	7938f9f50b	Fix bug with single_image_dataset which prevented it from working on multiple directories	2020-12-19 15:13:46 -07:00
James Betker	92f9a129f7	GLEAN!	2020-12-18 16:04:19 -07:00
James Betker	c717765bcb	Notes for lucidrains converter.	2020-12-18 09:55:38 -07:00
James Betker	1708136b55	Commit my attempt at "conforming" the lucidrains stylegan implementation to the reference spec. Not working. Will probably be abandoned.	2020-12-18 09:51:48 -07:00
James Betker	d875ca8342	More refactor changes	2020-12-18 09:24:31 -07:00
James Betker	5640e4efe4	More refactoring	2020-12-18 09:18:34 -07:00
James Betker	b905b108da	Large cleanup. Removed a lot of old code that I won't be touching again. Refactored some code elements into more logical places.	2020-12-18 09:10:44 -07:00
James Betker	2f0a52b7db	misc changes	2020-12-18 08:53:45 -07:00
James Betker	a8179ff53c	Image label work	2020-12-18 08:53:18 -07:00
James Betker	3074f41877	Get rosinality model converter to work. Mostly, just needed to remove the custom cuda ops, not so bueno on Windows.	2020-12-17 16:03:39 -07:00
James Betker	e838c6e75b	Rosinality stylegan2 port	2020-12-17 14:18:46 -07:00
James Betker	12cf052889	Add an image patch labeling UI	2020-12-17 10:16:21 -07:00
James Betker	e5a3e6b9b5	srflow latent space misc	2020-12-14 23:59:49 -07:00
James Betker	1e14635d88	Add exclusions to extract_subimages_with_ref	2020-12-14 23:59:41 -07:00
James Betker	0a19e53df0	BYOL mods	2020-12-14 23:59:11 -07:00
James Betker	ec0ee25f4b	Structural latents checkpoint	2020-12-11 12:01:09 -07:00
James Betker	9c5e272a22	Script to extract models from a wrapped BYOL model	2020-12-10 09:57:52 -07:00
James Betker	5369cba8ed	Stage	2020-12-08 00:33:07 -07:00
James Betker	c0aeaabc31	Spinenet playground	2020-12-07 12:49:32 -07:00
James Betker	88fc049c8d	spinenet latent playground!	2020-12-05 20:30:36 -07:00
James Betker	11155aead4	Directly use dataset keys. This has been a long time coming. Cleans up messy "GT" nomenclature and simplifies ExtensibleTrainer.feed_data	2020-12-04 20:14:53 -07:00

274 Commits