VQVAE2 in PyTorch

VQVAE2 is a generative autoencoder developed by DeepMind. Its unique innovation is discretizing the latent space into a fixed set of "codebook" vectors. This codebook can then be used in downstream tasks to rebuild images from the training set.
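
To make the discretization step concrete, here is a minimal sketch in plain Python of a nearest-neighbor codebook lookup. It is for illustration only and is not the code used in DLAS or in @rosinality's port. The function names `quantize` and `lookup`, the 3-entry codebook, and the sample latents are all made up for this example. A real model learns the codebook and does the lookup with batched tensor operations.

```python
# Illustrative sketch only: names and values here are hypothetical,
# not taken from the DLAS or rosinality code.

def quantize(latents, codebook):
    """Return, for each latent vector, the index of its nearest codebook vector."""
    indices = []
    for z in latents:
        # Squared Euclidean distance from z to every codebook entry.
        distances = [
            sum((a - b) ** 2 for a, b in zip(z, code)) for code in codebook
        ]
        # The discrete code is the position of the closest entry.
        indices.append(distances.index(min(distances)))
    return indices


def lookup(indices, codebook):
    """Replace each discrete index with its codebook vector (the decoder input)."""
    return [codebook[i] for i in indices]


# Hypothetical 3-entry codebook of 2-dimensional vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
# Hypothetical continuous encoder outputs.
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices = quantize(latents, codebook)
quantized = lookup(indices, codebook)
print(indices)
print(quantized)
```

Only the integer indices need to be stored or modeled downstream. The decoder sees the codebook vectors those indices point to, never the original continuous latents.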

This model is in DLAS thanks to the work @rosinality did converting the DeepMind model to PyTorch.

Training VQVAE2

VQVAE2 is trained in two steps:

Training the autoencoder

The first step is to train the autoencoder itself. The provided config file train_imgnet_vqvae_stage1.yml shows how to do this for ImageNet with the hyperparameters specified by DeepMind. You'll need to supply your own ImageNet folder for this.

Training the PixelCNN encoder

The second step is to train the PixelCNN model, which will create "codebook" vectors given an input image.
