reproducing: "Ensemble everything everywhere: Multi-scale aggregation for adversarial robustness" (https://arxiv.org/abs/2408.05446)
source code and usage examples: https://github.com/ETH-DISCO/self-ensembling
architecture based on Torchvision's default ResNet152 implementation
hyperparameters:
- criterion: `torch.nn.CrossEntropyLoss()`
- optimizer: `torch.optim.AdamW`
- scaler: `GradScaler`
- datasets: `["cifar10", "cifar100"]`
- lr: `0.0001`
- num_epochs: `16` (more epochs would likely help, but probably by less than 1%)
- crossmax_k: `2` (the difference between `crossmax_k=2` and `crossmax_k=3` is about 1-2%, so the choice matters little)
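A minimal sketch of how these settings might be wired together, not the repository's actual code. The names `CONFIG`, `crossmax`, and `build_training_objects` are hypothetical. The aggregation step follows one reading of the CrossMax procedure described in the paper (normalize per predictor, normalize per class, then take the k-th highest value across predictors), and `torch.cuda.amp.GradScaler` is an assumption about which `GradScaler` is meant.

```python
import torch
from torch import nn

# Values copied from the hyperparameter list above; the dict itself is illustrative.
CONFIG = {"lr": 1e-4, "num_epochs": 16, "crossmax_k": 2}


def crossmax(logits: torch.Tensor, k: int = CONFIG["crossmax_k"]) -> torch.Tensor:
    """Aggregate ensemble logits of shape [batch, predictors, classes].

    Assumed reading of CrossMax; the repository implementation may differ.
    """
    # 1) Subtract each predictor's maximum over classes.
    z = logits - logits.max(dim=2, keepdim=True).values
    # 2) Subtract each class's maximum over predictors.
    z = z - z.max(dim=1, keepdim=True).values
    # 3) Keep the k-th highest value per class across predictors.
    return z.topk(k, dim=1).values[:, k - 1, :]


def build_training_objects(model: nn.Module):
    """Create the criterion, optimizer, and scaler named in the list above."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=CONFIG["lr"])
    # Assumption: the CUDA AMP scaler is meant; it is disabled when no GPU is present.
    scaler = torch.cuda.amp.GradScaler(enabled=torch.cuda.is_available())
    return criterion, optimizer, scaler
```

For example, `crossmax(torch.randn(8, 5, 10))` would reduce logits from 5 predictors over 10 classes to a single `[8, 10]` tensor, which can then be passed to the criterion or to `argmax` for prediction.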