Model Card for MiVOLO V2 model

Space | Github | MiVOLO Paper (2023) | MiVOLO Paper (2024)
We introduce a state-of-the-art multi-input transformer for age and gender estimation.
This model was trained on proprietary and open-source datasets.
Figure: MiVOLO V1 (224x224) architecture.
Inference Requirements and Model Introduction
- Resolution: width and height of face/body crops must be 384px
- Precision: FP32 / FP16
- mivolo library:
pip install git+https://github.com/WildChlamydia/MiVOLO.git
- transformers==4.51.0
- accelerate==1.8.1
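Because the requirements above pin exact versions, it can help to confirm what is installed before loading the model. The following is a minimal sketch using only the standard library; `PINNED` and `find_mismatches` are hypothetical names introduced here, not part of the `mivolo` package, and treating any other version as a mismatch is an assumption (other versions may well work).

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements list above.
PINNED = {"transformers": "4.51.0", "accelerate": "1.8.1"}


def find_mismatches(pins):
    """Return {package: (expected, installed_or_None)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # Prints nothing when every pinned package matches.
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means the environment matches the pinned versions; anything reported is worth checking if loading the remote model code fails.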
Quick start
from transformers import AutoModelForImageClassification, AutoConfig, AutoImageProcessor
import torch
import cv2
import numpy as np
import requests
# load model and image processor
config = AutoConfig.from_pretrained(
"iitolstykh/mivolo_v2", trust_remote_code=True
)
mivolo_model = AutoModelForImageClassification.from_pretrained(
"iitolstykh/mivolo_v2", trust_remote_code=True, torch_dtype=torch.float16
)
image_processor = AutoImageProcessor.from_pretrained(
"iitolstykh/mivolo_v2", trust_remote_code=True
)
# download test image
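resp = requests.get('https://variety.com/wp-content/uploads/2023/04/MCDNOHA_SP001.jpg')
resp.raise_for_status()  # fail early if the download did not succeed
arr = np.asarray(bytearray(resp.content), dtype=np.uint8)
image = cv2.imdecode(arr, cv2.IMREAD_COLOR)  # decode to a 3-channel BGR array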
# face crops
x1, y1, x2, y2 = [625, 46, 686, 121]
faces_crops = [image[y1:y2, x1:x2]] # may be [None] if bodies_crops is not None
# body crops
x1, y1, x2, y2 = [534, 16, 790, 559]
bodies_crops = [image[y1:y2, x1:x2]] # may be [None] if faces_crops is not None
# prepare BGR inputs
faces_input = image_processor(images=faces_crops)["pixel_values"]
body_input = image_processor(images=bodies_crops)["pixel_values"]
faces_input = faces_input.to(dtype=mivolo_model.dtype, device=mivolo_model.device)
body_input = body_input.to(dtype=mivolo_model.dtype, device=mivolo_model.device)
# inference
output = mivolo_model(faces_input=faces_input, body_input=body_input)
# print results
age = output.age_output[0].item()
print(f"age: {round(age, 2)}")
id2label = config.gender_id2label
gender = id2label[output.gender_class_idx[0].item()]
gender_prob = output.gender_probs[0].item()
print(f"gender: {gender} [{int(gender_prob * 100)}%]")
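The quick start crops the image with hard-coded boxes that lie inside the frame. When boxes come from a detector they may extend past the image border, and NumPy slicing then behaves unexpectedly: negative start indices wrap around, and a box lying fully outside the image gives an empty crop. A minimal sketch of clipping a box before slicing is shown below; `clamp_box` is a hypothetical helper, not part of the `mivolo` library, and the 800x500 image size in the usage lines is made up for illustration.

```python
def clamp_box(box, width, height):
    """Clip (x1, y1, x2, y2) to a width x height image.

    Returns the clipped box, or None if no pixels remain after clipping.
    """
    x1, y1, x2, y2 = box
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    if x2 <= x1 or y2 <= y1:
        return None
    return x1, y1, x2, y2


# Hypothetical 800x500 image: the body box is cut at the bottom edge.
print(clamp_box((534, 16, 790, 559), width=800, height=500))  # (534, 16, 790, 500)
# A box fully to the right of the image leaves nothing to crop.
print(clamp_box((900, 10, 950, 60), width=800, height=500))   # None
```

With a decoded image, the size can be read as `height, width = image.shape[:2]`, and a `None` result can be passed on as a missing face or body crop, as the comments in the quick start describe.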
Model Metrics
Model | Test Dataset | Age Accuracy (%) | Gender Accuracy (%) |
---|---|---|---|
mivolov2_384x384 (fp16) | Adience | 70.2 | 97.3 |
Citation
If you find our work helpful, please consider citing our papers and starring the repository.
@article{mivolo2023,
Author = {Maksim Kuprashevich and Irina Tolstykh},
Title = {MiVOLO: Multi-input Transformer for Age and Gender Estimation},
Year = {2023},
Eprint = {arXiv:2307.04616},
}
@article{mivolo2024,
Author = {Maksim Kuprashevich and Grigorii Alekseenko and Irina Tolstykh},
Title = {Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation},
Year = {2024},
Eprint = {arXiv:2403.02302},
}
License
Please see here.