# VQVAE2 in PyTorch
[VQVAE2](https://arxiv.org/pdf/1906.00446.pdf) is a generative autoencoder developed by DeepMind. Its key innovation is discretizing the latent space into a fixed set of "codebook" vectors. This codebook can then be used in downstream tasks to rebuild images from the training set.
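
For intuition, here's a minimal sketch (not DLAS's actual code) of the quantization step: each encoder output vector is snapped to its nearest codebook entry, with a straight-through estimator so gradients still reach the encoder.

```python
import torch

def quantize(z, codebook):
    """Replace each latent vector with its nearest codebook entry.

    z:        (N, D) tensor of encoder outputs
    codebook: (K, D) tensor of learned codebook vectors
    """
    # Pairwise L2 distances between latents and codebook entries: (N, K)
    distances = torch.cdist(z, codebook)
    indices = distances.argmin(dim=1)   # nearest code per latent
    z_q = codebook[indices]             # quantized latents
    # Straight-through estimator: the forward pass uses the quantized
    # values, but gradients flow back to z as if quantization were
    # the identity function.
    z_q = z + (z_q - z).detach()
    return z_q, indices
```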
This model is in DLAS thanks to the work [@rosinality](https://github.com/rosinality) did [converting the DeepMind model](https://github.com/rosinality/vq-vae-2-pytorch) to PyTorch.
# Training VQVAE2
VQVAE2 is trained in two steps:
## Training the autoencoder
The first step is to train the autoencoder itself. The provided config file `train_imgnet_vqvae_stage1.yml` shows how to do this for ImageNet with the hyperparameters specified by DeepMind. You'll need to bring your own ImageNet folder for this.
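
Assuming DLAS's usual entry point (a `train.py` script that takes the config via `-opt`; check the repository's top-level README if your checkout differs), stage 1 training would be launched like:

```
python train.py -opt train_imgnet_vqvae_stage1.yml
```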
## Training the PixelCNN
The second step is to train the PixelCNN model, which will create "codebook" vectors given an input image.
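
As a rough sketch of the stage 2 data flow (assuming the `encode()` signature from rosinality's implementation; the `extract_codes` helper below is hypothetical, not part of DLAS), the frozen stage-1 model converts each image into grids of integer codebook indices, and the PixelCNN is then trained autoregressively to predict those indices:

```python
import torch

@torch.no_grad()
def extract_codes(vqvae, images):
    """Map a batch of images to codebook-index grids (hypothetical helper).

    Assumes rosinality's VQVAE.encode(), which returns
    (quant_top, quant_bottom, diff, id_top, id_bottom).
    """
    vqvae.eval()
    _, _, _, id_top, id_bottom = vqvae.encode(images)
    # id_top / id_bottom are integer grids; the PixelCNN's training loss
    # is a per-position cross-entropy over these indices.
    return id_top, id_bottom
```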