RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward Network Layers (ICML 2025)

This is the official model weights repository for RePaViT. For detailed instructions, please refer to https://github.com/Ackesnal/RePaViT.

0. Environment Setup

First, clone the repository locally:

git clone https://github.com/Ackesnal/RePaViT.git
cd RePaViT

Then, set up the environment via conda:

conda create -n repavit python=3.10 -y && conda activate repavit
conda install conda-forge::python-rocksdb -y
pip install torch torchvision torchaudio timm==1.0.3 einops ptflops wandb

[Recommended] Alternatively, you can install directly from the pre-defined environment YAML file:

conda env create -f environment.yml

After completing the above installation, the repository is ready to run.

We also use wandb for real-time tracking and visualization of the training process. wandb is optional; however, you will need to register and log in to wandb before using this functionality.

1. Dataset Preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder: training data goes in the train/ folder and validation data in the val/ folder:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
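Before launching training, it can help to sanity-check that the directory tree matches this convention. The following pure-Python sketch (the function name check_imagefolder_layout is ours, not part of the repo) verifies the datasets.ImageFolder layout described above:

```python
from pathlib import Path

def check_imagefolder_layout(root):
    """Verify that root/train and root/val each contain at least one
    class subdirectory, and that every class subdirectory holds at
    least one file, matching the datasets.ImageFolder convention."""
    root = Path(root)
    for split in ("train", "val"):
        split_dir = root / split
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()] if split_dir.is_dir() else []
        if not class_dirs:
            return False
        if not all(any(f.is_file() for f in d.iterdir()) for d in class_dirs):
            return False
    return True
```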

We also support RocksDB as an alternative dataset organization. On some HPC systems, a file-count quota prevents fully decompressing ImageNet onto high-speed I/O disks. In that case, RocksDB enables efficient and stable ImageNet storage and loading without millions of small image files.

To insert ImageNet into a RocksDB database, simply run

python insert_rocksdb.py

(replace tar_path_root and db_path_root in insert_rocksdb.py with your own source and target root paths).

When training the model, use the --rocksdb argument instead of --data_path to specify the database location.
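To illustrate the key-value idea (not the exact key scheme or serialization used by insert_rocksdb.py, which may differ), here is a minimal sketch in which a plain dict stands in for the RocksDB handle:

```python
import pickle

# A plain dict stands in for a rocksdb.DB handle; with python-rocksdb
# you would call db.put(key, value) / db.get(key) on byte keys.
db = {}

def put_sample(db, index, image_bytes, label):
    # One key per sample: millions of images become entries in a single
    # database instead of millions of small files on disk.
    db[f"sample-{index}".encode()] = pickle.dumps((image_bytes, label))

def get_sample(db, index):
    image_bytes, label = pickle.loads(db[f"sample-{index}".encode()])
    return image_bytes, label
```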

2. Evaluation

2.1. Accuracy evaluation

To evaluate prediction performance, run the command below. Please ensure --idle_ratio is set to the same value used for the pretrained model weights.

[RePaViT-Large] performance test:

torchrun --nproc_per_node=4 main.py \
  --model=RePaViT_Large \
  --batch_size=512 \
  --eval \
  --dist_eval \
  --channel_idle \
  --idle_ratio=0.75 \
  --feature_norm=BatchNorm \
  --data_path=/path/to/imagenet \
  --resume=/path/to/pretrained_weight.pth

For your convenience, we also provide the equivalent one-line command:

torchrun --nproc_per_node=4 main.py --model=RePaViT_Large --batch_size=512 --eval --dist_eval --channel_idle --idle_ratio=0.75 --feature_norm=BatchNorm --data_path=/path/to/imagenet --resume=/path/to/pretrained_weight.pth
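main.py reports accuracy through the repo's own evaluation loop; purely to illustrate the metric being reported, here is a minimal pure-Python sketch of top-k accuracy (the function name topk_accuracy is ours):

```python
def topk_accuracy(logits, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring
    classes. logits: list of per-class score lists; labels: list of ints."""
    correct = 0
    for scores, label in zip(logits, labels):
        topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        correct += label in topk
    return correct / len(labels)
```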

2.2. Inference speed test

To test inference speed, use the --test_speed and --only_test_speed arguments; setting the number of processes to 1 is recommended:

[RePaViT-Large] speed test:

torchrun --nproc_per_node=1 main.py \
  --model=RePaViT_Large \
  --channel_idle \
  --idle_ratio=0.75 \
  --feature_norm=BatchNorm \
  --test_speed

For your convenience, we also provide the equivalent one-line command:

torchrun --nproc_per_node=1 main.py --model=RePaViT_Large --channel_idle --idle_ratio=0.75 --feature_norm=BatchNorm --test_speed
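The --test_speed flag handles timing inside main.py; as a generic illustration of the usual measurement protocol (warm-up iterations discarded, then timed iterations averaged), here is a hypothetical helper:

```python
import time

def measure_latency(fn, warmup=10, iters=100):
    """Average wall-clock latency of fn() in seconds. Warm-up runs are
    discarded so one-off costs (caching, CUDA context creation, etc.)
    do not skew the average; on GPU you would also synchronize before
    reading the clock (e.g. torch.cuda.synchronize())."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters
```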

2.3. Evaluation with Structural Reparameterization

To enable inference-stage model compression via structural reparameterization, simply add the --reparam argument:

[RePaViT-Large] speed test after structural reparameterization:

torchrun --nproc_per_node=1 main.py \
  --model=RePaViT_Large \
  --channel_idle \
  --idle_ratio=0.75 \
  --feature_norm=BatchNorm \
  --test_speed \
  --reparam

For your convenience, we also provide the equivalent one-line command:

torchrun --nproc_per_node=1 main.py --model=RePaViT_Large --channel_idle --idle_ratio=0.75 --feature_norm=BatchNorm --test_speed --reparam

--reparam can be combined with performance evaluation as well. The prediction accuracy before and after reparameterization should be identical.
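Accuracy is preserved because BatchNorm is an affine map at inference time, so it can be folded exactly into an adjacent linear layer. The exact merge RePaViT performs lives in the repo; the following pure-Python sketch (function name ours) shows the core BatchNorm-folding identity:

```python
import math

def fold_bn_into_linear(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BatchNorm statistics into a preceding linear
    layer so that linear -> BN collapses into a single linear layer:
      BN(Wx + b) = W'x + b'  with
      W'[i][j] = gamma[i] / sqrt(var[i] + eps) * W[i][j]
      b'[i]    = gamma[i] * (b[i] - mean[i]) / sqrt(var[i] + eps) + beta[i]
    """
    W_folded, b_folded = [], []
    for i in range(len(W)):
        scale = gamma[i] / math.sqrt(var[i] + eps)
        W_folded.append([scale * w for w in W[i]])
        b_folded.append(scale * (b[i] - mean[i]) + beta[i])
    return W_folded, b_folded
```

Since the folded layer computes exactly the same function, the top-1 accuracy before and after the merge matches to floating-point precision.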

3. Supported Models

In this repo, we currently support the following backbone models:

  • RePaViT-Tiny (i.e., RePaDeiT-Tiny)
  • RePaViT-Small (i.e., RePaDeiT-Small)
  • RePaViT-Base (i.e., RePaDeiT-Base)
  • RePaViT-Large
  • RePaViT-Huge
  • RePaSwin-Tiny
  • RePaSwin-Small
  • RePaSwin-Base

4. Reference

If you use this repo or find it useful, please consider citing:

@inproceedings{xu2025repavit,
  title = {RePaViT: Scalable Vision Transformer Acceleration via Structural Reparameterization on Feedforward Network Layers},
  author = {Xu, Xuwei and Li, Yang and Chen, Yudong and Liu, Jiajun and Wang, Sen},
  booktitle = {The 42nd International Conference on Machine Learning (ICML)},
  year = {2025}
}