Model Card for MiVOLO V2

🤗 Space | 🌐 GitHub | 📜 MiVOLO Paper (2023) | 📜 MiVOLO Paper (2024)

We introduce a state-of-the-art multi-input transformer for age and gender estimation.

This model was trained on proprietary and open-source datasets.

MiVOLO V1 (224x224) architecture:

[Figure: MiVOLO V1 architecture diagram]

Inference Requirements and Model Introduction

  • Resolution: Width and height of face/body crops must be 384px
  • Precision: FP32 / FP16
  • mivolo library, installed with:
      pip install git+https://github.com/WildChlamydia/MiVOLO.git
  • transformers==4.51.0
  • accelerate==1.8.1
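
Because the list above pins exact versions, it can help to confirm the environment before loading the model. The following is a minimal sketch, not part of the mivolo library: the helper name `find_mismatches` is hypothetical, and only the standard-library `importlib.metadata` module is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements list above.
PINNED = {"transformers": "4.51.0", "accelerate": "1.8.1"}


def find_mismatches(pinned, get_version=version):
    """Return {package: installed_version_or_None} for every pin that is not met.

    `get_version` is injectable so the check can be exercised without
    touching the real environment (hypothetical helper, for illustration).
    """
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(PINNED).items():
        found = installed or "not installed"
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

An empty result means both pins are satisfied. Other versions may also work; the check only reports deviations from the versions listed here.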

Quick start

from transformers import AutoModelForImageClassification, AutoConfig, AutoImageProcessor
import torch
import cv2
import numpy as np
import requests

# load model and image processor
config = AutoConfig.from_pretrained(
    "iitolstykh/mivolo_v2", trust_remote_code=True
)
mivolo_model = AutoModelForImageClassification.from_pretrained(
    "iitolstykh/mivolo_v2", trust_remote_code=True, torch_dtype=torch.float16
)
image_processor = AutoImageProcessor.from_pretrained(
    "iitolstykh/mivolo_v2", trust_remote_code=True
)

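# download test image
resp = requests.get('https://variety.com/wp-content/uploads/2023/04/MCDNOHA_SP001.jpg')
resp.raise_for_status()  # fail early on a bad HTTP response
arr = np.asarray(bytearray(resp.content), dtype=np.uint8)
image = cv2.imdecode(arr, cv2.IMREAD_COLOR)  # decode to a 3-channel BGR array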

# face crops
x1, y1, x2, y2 = [625,  46, 686, 121]
faces_crops = [image[y1:y2, x1:x2]]  # may be [None] if bodies_crops is not None

# body crops
x1, y1, x2, y2 = [534,  16, 790, 559]
bodies_crops = [image[y1:y2, x1:x2]]  # may be [None] if faces_crops is not None

# prepare BGR inputs
faces_input = image_processor(images=faces_crops)["pixel_values"]
body_input = image_processor(images=bodies_crops)["pixel_values"]

faces_input = faces_input.to(dtype=mivolo_model.dtype, device=mivolo_model.device)
body_input = body_input.to(dtype=mivolo_model.dtype, device=mivolo_model.device)

# inference
output = mivolo_model(faces_input=faces_input, body_input=body_input)

# print results
age = output.age_output[0].item()
print(f"age: {round(age, 2)}")

id2label = config.gender_id2label
gender = id2label[output.gender_class_idx[0].item()]
gender_prob = output.gender_probs[0].item()
print(f"gender: {gender} [{int(gender_prob * 100)}%]")
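
The quick start slices crops directly with hard-coded coordinates. When boxes come from a detector, they can extend past the image border, so a small clamping helper may be useful. This is a sketch under assumptions: `crop_box` is a hypothetical helper, not part of mivolo or transformers, and the 600x800 placeholder image stands in for a real decoded frame.

```python
import numpy as np


def crop_box(image, box):
    """Crop box=(x1, y1, x2, y2) from an H x W x C image, clamped to the image bounds.

    Returns None when the clamped box is empty (hypothetical helper).
    """
    height, width = image.shape[:2]
    x1, y1, x2, y2 = box
    x1, x2 = max(0, int(x1)), min(width, int(x2))
    y1, y2 = max(0, int(y1)), min(height, int(y2))
    if x2 <= x1 or y2 <= y1:
        return None
    return image[y1:y2, x1:x2]


# Placeholder BGR image (assumption: 600 rows x 800 columns).
image = np.zeros((600, 800, 3), dtype=np.uint8)

face = crop_box(image, (625, 46, 686, 121))     # same face box as the quick start
body = crop_box(image, (534, 16, 790, 559))     # same body box as the quick start
clipped = crop_box(image, (700, -20, 900, 100))  # partly outside, gets clamped
outside = crop_box(image, (900, 0, 950, 50))     # fully outside, yields None

print(face.shape, body.shape, clipped.shape, outside)
```

The quick start comments indicate that one of the two crop lists may be `[None]` as long as the other is provided, so a `None` result for a single missing crop fits that convention; a sample with both crops missing should be skipped.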

Model Metrics

| Model | Test Dataset | Age Accuracy (%) | Gender Accuracy (%) |
|---|---|---|---|
| mivolov2_384x384 (fp16) | Adience | 70.2 | 97.3 |

Citation

🌟 If you find our work helpful, please consider citing our papers and starring the repository.

@article{mivolo2023,
   Author = {Maksim Kuprashevich and Irina Tolstykh},
   Title = {MiVOLO: Multi-input Transformer for Age and Gender Estimation},
   Year = {2023},
   Eprint = {arXiv:2307.04616},
}
@article{mivolo2024,
   Author = {Maksim Kuprashevich and Grigorii Alekseenko and Irina Tolstykh},
   Title = {Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation},
   Year = {2024},
   Eprint = {arXiv:2403.02302},
}

License

Please see here.

Model size: 28.8M params (Safetensors, tensor type F32)