Ultra Zoom

A fast single image super-resolution (SISR) model for upscaling images without loss of detail. Ultra Zoom follows a two-stage "zoom in and enhance" strategy: a fast deterministic upscaling algorithm zooms in, and a residual pathway that operates primarily in the low-resolution subspace of a deep neural network then enhances the image. As such, Ultra Zoom requires fewer resources than upscalers that predict every new pixel de novo, making it well suited to real-time image processing.

Key Features

Fast and scalable: Instead of predicting the individual pixels of the upscaled image, Ultra Zoom uses a unique "zoom in and enhance" approach that combines the speed of deterministic bicubic interpolation with the power of a deep neural network.
Full RGB: Unlike many efficient SR models that only operate in the luminance domain, Ultra Zoom operates within the full RGB color domain, enhancing both luminance and chrominance for the best possible quality.
Denoising and Deblurring: During the enhancement stage, the model removes multiple types of noise and blur, making images look crisp and clean.
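
The "zoom in and enhance" idea described above can be illustrated with a toy PyTorch module. This is a minimal sketch, not the actual Ultra Zoom architecture: the class name `ZoomAndEnhanceSketch`, the layer sizes, the SiLU activation, and the use of a sub-pixel convolution (`PixelShuffle`) to lift the residual to the output resolution are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZoomAndEnhanceSketch(nn.Module):
    """Toy two-stage upscaler: bicubic zoom plus a learned residual (illustrative only)."""

    def __init__(self, upscale_ratio: int = 2, num_channels: int = 16):
        super().__init__()
        self.upscale_ratio = upscale_ratio

        # Residual branch: every convolution here runs at the low input resolution.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, num_channels, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv2d(num_channels, num_channels, kernel_size=3, padding=1),
            nn.SiLU(),
        )

        # Sub-pixel convolution: predict ratio^2 values per RGB channel,
        # then rearrange them into a residual at the output resolution.
        self.to_residual = nn.Sequential(
            nn.Conv2d(num_channels, 3 * upscale_ratio**2, kernel_size=3, padding=1),
            nn.PixelShuffle(upscale_ratio),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Stage 1 ("zoom in"): deterministic bicubic interpolation.
        zoomed = F.interpolate(
            x, scale_factor=self.upscale_ratio, mode="bicubic", align_corners=False
        )

        # Stage 2 ("enhance"): add the learned residual to the zoomed image.
        residual = self.to_residual(self.encoder(x))

        return zoomed + residual


sketch = ZoomAndEnhanceSketch(upscale_ratio=2)
low_res = torch.rand(1, 3, 8, 8)  # one random 8x8 RGB image in [0, 1]
high_res = sketch(low_res)
print(high_res.shape)  # torch.Size([1, 3, 16, 16])
```

Because the network only has to predict a correction to the bicubic result, its convolutions can stay at the input resolution, which is where the savings over predicting every output pixel directly would come from in a design like this.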

Demo

View at full resolution for best results. More comparisons can be found here.

Pretrained Models

The following pretrained models are available on HuggingFace Hub.

Name	Zoom	Num Channels	Hidden Ratio	Encoder Layers	Total Parameters
andrewdalpino/UltraZoom-2X	2X	48	2X	20	1.8M
andrewdalpino/UltraZoom-3X	3X	54	2X	30	3.5M
andrewdalpino/UltraZoom-4X	4X	96	2X	40	14M
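
To compare the Total Parameters column against a model you have loaded, you can count parameters with plain PyTorch. The helper name `count_parameters` is illustrative and not part of the ultrazoom package. The count for a real checkpoint may differ somewhat from the table depending on how the figure was derived, so treat this as a rough check.

```python
import torch.nn as nn


def count_parameters(module: nn.Module) -> int:
    """Return the total number of parameters in a PyTorch module."""
    return sum(p.numel() for p in module.parameters())


# Self-contained check: a 3 -> 2 linear layer has 3 * 2 weights + 2 biases.
print(count_parameters(nn.Linear(3, 2)))  # 8
```

The same function can be applied to a model returned by `UltraZoom.from_pretrained`, since it only relies on the standard `parameters()` method of `nn.Module`.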

Pretrained Example

If you'd just like to load the pretrained weights and run inference, the example below is all you need to get started. First, install the ultrazoom and torchvision Python packages into your project.

pip install ultrazoom torchvision

Next, load the model weights from HuggingFace Hub and feed the network some images.

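import torch

from torchvision.io import decode_image
from torchvision.transforms.v2 import ToDtype, ToPILImage

from ultrazoom.model import UltraZoom


model_name = "andrewdalpino/UltraZoom-2X"
image_path = "./dataset/bird.png"

# Download (or load from cache) the pretrained weights from HuggingFace Hub.
model = UltraZoom.from_pretrained(model_name)

# Convert uint8 pixels in [0, 255] to float32 values in [0, 1] and back to PIL.
image_to_tensor = ToDtype(torch.float32, scale=True)
tensor_to_pil = ToPILImage()

# Read the image from disk as a (3, H, W) RGB tensor.
image = decode_image(image_path, mode="RGB")

# Add a batch dimension so the input has shape (1, 3, H, W).
x = image_to_tensor(image).unsqueeze(0)

y_pred = model.upscale(x)

# Drop the batch dimension and display the upscaled image.
pil_image = tensor_to_pil(y_pred.squeeze(0))

pil_image.show()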
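
When running inference in your own scripts, it is common to disable gradient tracking and clamp the output to the valid [0, 1] range before converting it to an image. The sketch below shows that pattern. It assumes only that the model exposes an `upscale` method as in the example above; `upscale_clamped` and `BicubicStandIn` are hypothetical names introduced here so the snippet runs without downloading weights, and `upscale` may already perform some of these steps internally.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def upscale_clamped(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Run model.upscale without gradients and clamp the result to [0, 1]."""
    model.eval()

    with torch.no_grad():
        y = model.upscale(x)

    return y.clamp(0.0, 1.0)


class BicubicStandIn(nn.Module):
    """Hypothetical stand-in with an `upscale` method, used only for this demo."""

    def upscale(self, x: torch.Tensor) -> torch.Tensor:
        # Bicubic interpolation can overshoot slightly outside [0, 1].
        return F.interpolate(x, scale_factor=2, mode="bicubic", align_corners=False)


x = torch.rand(1, 3, 4, 4)
y = upscale_clamped(BicubicStandIn(), x)
print(y.shape)  # torch.Size([1, 3, 8, 8])
```

With a real checkpoint, the stand-in would be replaced by the model returned from `UltraZoom.from_pretrained`.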

Code Repository

The code repository can be found at https://github.com/andrewdalpino/UltraZoom.

References

Z. Liu, et al. A ConvNet for the 2020s, 2022.

J. Yu, et al. Wide Activation for Efficient and Accurate Image Super-Resolution, 2018.

J. Johnson, et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016.

W. Shi, et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, 2016.

T. Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, OpenAI, 2016.
