Swin S3 Base (224) - Pascal VOC
A Swin S3 Base model fine-tuned on the Pascal VOC 2012 dataset for multi-class image classification.
Model Details
- Architecture: Swin S3 Base (224x224 input size)
- Pretrained on: ImageNet-1k
- Fine-tuned on: Pascal VOC 2012
- Framework: PyTorch (timm implementation)
- Format: safetensors
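A minimal loading sketch for the details above. It assumes that the checkpoint's parameter names match timm's `swin_s3_base_224` architecture with a 20-way classification head, and that `model.safetensors` has already been downloaded locally. This card confirms neither assumption, and the `load_model` helper and the constants are illustrative names rather than part of this repository.

```python
# Hypothetical loading helper; names and paths below are assumptions, not part of this repo.
MODEL_NAME = "swin_s3_base_224"   # assumed timm architecture name
NUM_CLASSES = 20                  # Pascal VOC categories
INPUT_SIZE = (3, 224, 224)        # channels, height, width


def load_model(weights_path="model.safetensors"):
    """Build the architecture with timm and load the fine-tuned weights."""
    # Imported lazily so the constants above can be used without the heavy dependencies.
    import timm
    from safetensors.torch import load_file

    model = timm.create_model(MODEL_NAME, pretrained=False, num_classes=NUM_CLASSES)
    state_dict = load_file(weights_path)  # dict of tensor name -> torch.Tensor
    model.load_state_dict(state_dict)     # raises if keys or shapes do not match
    return model.eval()
```

If `load_state_dict` reports missing or unexpected keys, the checkpoint was probably saved from a different wrapper, and the key names would need to be remapped before loading.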
Intended Use
- Primary task: Image classification of natural scenes featuring objects from 20 Pascal VOC categories.
- Users: Researchers and developers working on computer vision applications or model benchmarking.
- Not intended for: Real-time decision making in critical applications (e.g., autonomous vehicles, medical diagnosis).
Limitations and Ethical Considerations
- Biases: The model inherits biases present in Pascal VOC, such as underrepresentation of certain object types, contexts, or demographics. It may perform poorly on out-of-distribution samples.
- Ethical Use: Avoid using this model for applications that could reinforce harmful stereotypes, cause social harm, or violate privacy (e.g., surveillance).
- Transparency: This model is shared for research and educational use and should not be deployed without thorough fairness, robustness, and security evaluations.
Training Details
- Training library: timm + PyTorch
- Epochs: 5
- Batch size: 16
- Optimizer: AdamW
- Learning rate: 5e-5
- Scheduler: Cosine Annealing
- Loss function: BCE
- Hardware: 1x NVIDIA A100 on Google Colab Pro
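To make the schedule and loss listed above concrete, here is a dependency-free sketch of what cosine annealing and binary cross-entropy compute. It assumes the learning rate is stepped once per epoch, that it decays to a floor of 0, and that the loss is applied to raw logits. The card states none of these, so treat the numbers as illustrative rather than as a record of the actual run.

```python
import math

BASE_LR = 5e-5   # learning rate from the list above
EPOCHS = 5       # epochs from the list above
MIN_LR = 0.0     # assumption: the card does not state a floor


def cosine_annealing_lr(epoch, base_lr=BASE_LR, total=EPOCHS, min_lr=MIN_LR):
    """Learning rate at the start of `epoch` (0-based) under cosine annealing."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * epoch / total))


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over independent per-class logits."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(z)
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# One learning rate per epoch, starting at BASE_LR and decreasing monotonically.
schedule = [cosine_annealing_lr(e) for e in range(EPOCHS)]
```

In PyTorch, the corresponding built-ins are `torch.optim.lr_scheduler.CosineAnnealingLR` and `torch.nn.BCEWithLogitsLoss`. Whether the training run used exactly these classes is not stated here.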
Evaluation Results
Evaluated on Pascal VOC 2012 test set:
| Metric | Value |
|---|---|
| roc_auc | 98.9% |
Note: Evaluation performed using standard multi-class metrics. Model was not evaluated on cross-domain generalization.
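For readers unfamiliar with the reported metric, the sketch below computes ROC AUC per class and then takes an unweighted (macro) average. The card does not say how the 98.9% figure was averaged across classes or which library produced it, so the macro average and both function names are assumptions made for illustration.

```python
def roc_auc(scores, labels):
    """ROC AUC for one class: probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_roc_auc(score_rows, label_rows):
    """Unweighted mean of per-class ROC AUC over the columns of two row-major matrices."""
    per_class = [
        roc_auc(list(s_col), list(y_col))
        for s_col, y_col in zip(zip(*score_rows), zip(*label_rows))
    ]
    return sum(per_class) / len(per_class)
```

The pairwise count is quadratic in the number of examples, which is fine for a small check but slow on a full evaluation set; a rank-based implementation such as `sklearn.metrics.roc_auc_score` is the usual choice there.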
Dataset
- Name: Pascal VOC 2012
- License: Creative Commons Attribution 4.0 International
- Labels: 20 object categories (person, car, dog, etc.)
- Split used: Training for fine-tuning, validation for evaluation
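The label set above can be written out as follows. Because the card lists BCE as the training loss, this sketch encodes each image's labels as a 20-element 0/1 vector, so an image can carry several categories at once. The index order (the dataset's usual alphabetical listing), the helper names, and the 0.5 decoding threshold are assumptions; the card does not document the mapping used by the fine-tuned head.

```python
# The 20 Pascal VOC categories in the dataset's usual alphabetical order.
# Assumption: the fine-tuned head uses this index order; the card does not say.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def to_multi_hot(present):
    """Encode the set of category names present in an image as a 0/1 vector."""
    unknown = set(present) - set(VOC_CLASSES)
    if unknown:
        raise ValueError(f"not Pascal VOC categories: {sorted(unknown)}")
    return [1.0 if name in present else 0.0 for name in VOC_CLASSES]


def from_scores(scores, threshold=0.5):
    """Decode per-class scores back into names; 0.5 is an illustrative threshold."""
    return [name for name, s in zip(VOC_CLASSES, scores) if s >= threshold]
```

For example, `to_multi_hot({"person", "dog"})` sets exactly two positions to 1.0, and passing that vector to `from_scores` recovers the same two names in index order.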
Files in This Repository
- model.safetensors: Model weights
- README.md: Model card (this file)
Citations
@inproceedings{liu2021swin,
title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
booktitle={ICCV},
year={2021}
}
@article{Everingham10,
author = {Everingham, M. and Van Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.},
title = {The Pascal Visual Object Classes (VOC) Challenge},
journal = {IJCV},
year = {2010},
volume = {88},
number = {2},
pages = {303--338}
}
Model tree for fylex/swin-s3-base-pascal_test
Base model: BobMcDear/swin_s3_base_224