VQ-VAE for Tiny ImageNet (ImageNet-200)
This repository contains a Vector Quantized Variational Autoencoder (VQ-VAE) trained on the Tiny ImageNet-200 dataset using PyTorch. It is part of an image augmentation and representation learning pipeline for generative modeling and unsupervised learning tasks.
Model Details
- Model Type: Vector Quantized Variational Autoencoder (VQ-VAE)
- Dataset: Tiny ImageNet (ImageNet-200)
- Epochs: 35
- Latent Space: Discrete codebook (vector quantization)
- Input Size: 64×64 RGB
- Loss Function: Mean Squared Error (MSE) + VQ commitment loss
- Final Training Loss: ~0.0292
- FID Score: ~102.87
- Architecture: 3-layer CNN encoder and decoder with a vector-quantization bottleneck (see the sketch below)
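The card does not pin down layer widths, codebook size, or commitment cost, so the following is only a minimal sketch of the layout described above: three convolutional layers on each side around a vector-quantization bottleneck, trained with MSE reconstruction loss plus the VQ commitment term. All hyperparameters shown (`hidden=64`, `num_embeddings=512`, `beta=0.25`) are illustrative placeholders, not values read from this checkpoint.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Nearest-neighbour codebook lookup with straight-through gradients."""
    def __init__(self, num_embeddings=512, embedding_dim=64, beta=0.25):
        super().__init__()
        self.embedding = nn.Embedding(num_embeddings, embedding_dim)
        self.embedding.weight.data.uniform_(-1.0 / num_embeddings, 1.0 / num_embeddings)
        self.beta = beta  # commitment cost (placeholder value)

    def forward(self, z):
        # z: (B, C, H, W) -> (B*H*W, C) for the codebook lookup
        z_perm = z.permute(0, 2, 3, 1).contiguous()
        flat = z_perm.view(-1, z_perm.shape[-1])
        # squared L2 distance from each latent vector to every codebook entry
        dist = (flat.pow(2).sum(1, keepdim=True)
                - 2 * flat @ self.embedding.weight.t()
                + self.embedding.weight.pow(2).sum(1))
        indices = dist.argmin(1)
        z_q = self.embedding(indices).view(z_perm.shape)
        # codebook loss + commitment loss (the "VQ commitment loss" above)
        loss = F.mse_loss(z_q, z_perm.detach()) + self.beta * F.mse_loss(z_perm, z_q.detach())
        # straight-through estimator: gradients flow from z_q back to the encoder
        z_q = z_perm + (z_q - z_perm).detach()
        return z_q.permute(0, 3, 1, 2).contiguous(), loss

class VQVAE(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # 3 strided conv layers: 64x64 input -> 8x8 latent grid
        self.encoder = nn.Sequential(
            nn.Conv2d(3, hidden, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 4, 2, 1),
        )
        self.quantizer = VectorQuantizer(embedding_dim=hidden)
        # mirror-image 3-layer decoder; the real model may add an output activation
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(hidden, hidden, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(hidden, hidden, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(hidden, 3, 4, 2, 1),
        )

    def forward(self, x):
        z_q, vq_loss = self.quantizer(self.encoder(x))
        return self.decoder(z_q), vq_loss
```

The total training objective is then the MSE between input and reconstruction plus `vq_loss`, matching the loss function listed above.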
Files
- `generator.pt` – Trained VQ-VAE model weights
- `loss_curve.png` – Plot of training loss across 35 epochs
- `fid_score.json` – FID evaluation result on 1000 generated samples
- `fid_real/` – 1000 real Tiny ImageNet samples used for FID
- `fid_fake/` – 1000 VQ-VAE reconstructions used for FID
Usage
```python
import torch
from models.vqvae.model import VQVAE

# Instantiate the model and load the trained weights on CPU
model = VQVAE()
model.load_state_dict(torch.load("generator.pt", map_location="cpu"))
model.eval()
```
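A quick way to sanity-check the loaded weights is to run a reconstruction. The snippet below uses a random tensor in place of real Tiny ImageNet images and assumes the forward pass returns either the reconstruction alone or a `(reconstruction, vq_loss)` tuple; adjust the unpacking to match the actual `VQVAE` signature.

```python
import torch

# Dummy batch of four 64x64 RGB images standing in for real data
x = torch.rand(4, 3, 64, 64)

with torch.no_grad():
    out = model(x)

# forward() may return just the reconstruction or a (reconstruction, vq_loss) tuple
recon = out[0] if isinstance(out, tuple) else out
print(recon.shape)  # expected: torch.Size([4, 3, 64, 64])
```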
Evaluation results
- FID on Tiny ImageNet (ImageNet-200): 102.870 (self-reported)
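The score above can be recomputed from the `fid_real/` and `fid_fake/` folders. Below is a minimal sketch using `torchmetrics` and `torchvision`, which are assumptions on my part rather than stated dependencies of this repo; it assumes both folders contain standard image files (PNG/JPEG).

```python
import torch
from pathlib import Path
from PIL import Image
from torchvision import transforms
from torchmetrics.image.fid import FrechetInceptionDistance

# FrechetInceptionDistance expects uint8 images by default; resizing to the
# InceptionV3 input size up front keeps the comparison consistent
to_uint8 = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.PILToTensor(),
])

def load_folder(folder):
    imgs = [to_uint8(Image.open(p).convert("RGB")) for p in sorted(Path(folder).iterdir())]
    return torch.stack(imgs)

fid = FrechetInceptionDistance(feature=2048)
fid.update(load_folder("fid_real"), real=True)
fid.update(load_folder("fid_fake"), real=False)
print(float(fid.compute()))  # should land near the self-reported ~102.87
```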