HIKARI-Altair-8B-SkinDx
Healthcare-oriented Intelligent Knowledge Augmented Retrieval and Inference
Named after Altair, the brightest star in Aquila: fast and direct, like single-image fine-tuning
📦 Model Type: Merged Full Model
This is a fully merged model — the LoRA adapter weights have been merged directly into the base model weights.
✅ No adapter loading needed. Load and run directly with transformers, vLLM, or SGLang.
💾 Size: ~17 GB (4 safetensors shards)
🔌 Lightweight adapter version: E27085921/HIKARI-Altair-8B-SkinDx-LoRA (~1.1 GB)
Overview
HIKARI-Altair is the single-image fine-tuning baseline for 10-class skin disease diagnosis. It is fine-tuned from Qwen/Qwen3-VL-8B-Thinking with Fuzzy Top-K label sampling, without cascade pretraining or RAG. Use this model to understand the baseline performance before cascade and RAG improvements.
| Property | Value |
|---|---|
| Task | 10-class skin disease diagnosis (Stage 2) |
| Base model | Qwen/Qwen3-VL-8B-Thinking |
| Training | Unsloth + LoRA, Fuzzy Top-K sampling |
| Val accuracy | 74.00% (99 samples, SkinCAP 3-stage split) |
| Model type | Merged full model |
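Because the validation set has only 99 samples, the reported accuracy carries noticeable sampling uncertainty. The sketch below is an illustration added here, not part of the original evaluation. It computes a 95% Wilson score interval for a reported accuracy and sample count. The function name `wilson_interval` is a hypothetical helper, and the choice of the Wilson method is an assumption.

```python
import math
from typing import Tuple


def wilson_interval(accuracy: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    z2 = z * z
    denom = 1.0 + z2 / n
    center = (accuracy + z2 / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z2 / (4 * n * n)) / denom
    return center - half, center + half


# Values taken from the table above: 74.00% accuracy on 99 validation samples.
low, high = wilson_interval(0.74, 99)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

With a set this small the interval spans well over ten percentage points, so small accuracy differences between models should be read with care.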
10 Disease Classes
acne_vulgaris · atopic_dermatitis · melanocytic_nevi · psoriasis · sccis · seborrheic_dermatitis · skin_tag · tinea_versicolor · urticaria · photodermatoses
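The model returns free text, so downstream code usually needs to map that text back to one of the ten labels. The following is a minimal sketch of one way to do that. It assumes the generated answer mentions a class name either with underscores or with spaces. The helper names `_normalize` and `match_class` are hypothetical and not part of the HIKARI release.

```python
import re
from typing import Optional

CLASSES = [
    "acne_vulgaris", "atopic_dermatitis", "melanocytic_nevi", "psoriasis",
    "sccis", "seborrheic_dermatitis", "skin_tag", "tinea_versicolor",
    "urticaria", "photodermatoses",
]


def _normalize(text: str) -> str:
    # Lowercase and collapse every run of non-alphanumeric characters into one space.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def match_class(generated: str) -> Optional[str]:
    """Return the class label mentioned earliest in the text, or None if none appears."""
    haystack = f" {_normalize(generated)} "
    hits = []
    for label in CLASSES:
        pos = haystack.find(f" {_normalize(label)} ")
        if pos >= 0:
            hits.append((pos, label))
    return min(hits)[1] if hits else None


print(match_class("Seborrheic dermatitis."))
print(match_class("I am not sure."))
```

Matching on whole words keeps `atopic_dermatitis` and `seborrheic_dermatitis` from being confused with each other.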
📊 Comparison with Other HIKARI Disease Models
| Model | Accuracy | Improvement over this model |
|---|---|---|
| HIKARI-Altair (this model) | 74.00% | — |
| HIKARI-Deneb (Cascade FT) | 79.80% | +5.80 pp |
| HIKARI-Sirius (RAG-in-Training) ⭐ | 85.86% | +11.86 pp |
🔧 Quick Inference — transformers
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
import torch
from PIL import Image
model_id = "E27085921/HIKARI-Altair-8B-SkinDx"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)
image = Image.open("skin_lesion.jpg").convert("RGB")
group = "inflammatory" # from Stage 1 (HIKARI-Subaru)
PROMPT = (
    "This skin lesion belongs to the group '{group}'. "
    "Examine the lesion morphology (papules, plaques, macules), "
    "color (red, violet, white, brown), scale/crust, border sharpness, "
    "and distribution pattern. Based on these visual features, "
    "what is the specific skin disease?"
)
messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": PROMPT.format(group=group)},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
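# Greedy decoding: with do_sample=False, sampling parameters such as temperature are not used.
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt tokens.
print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0].strip())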
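Since the merged model can also be served with vLLM, the same prompt can be sent over vLLM's OpenAI-compatible chat endpoint. The sketch below rests on assumptions: a server started with `vllm serve E27085921/HIKARI-Altair-8B-SkinDx`, listening on `http://localhost:8000`, and a JPEG input image. The helper names `build_chat_request` and `send` are hypothetical, and the setup has not been verified against this specific model.

```python
import base64
import json
import urllib.request

PROMPT = (
    "This skin lesion belongs to the group '{group}'. "
    "Examine the lesion morphology (papules, plaques, macules), "
    "color (red, violet, white, brown), scale/crust, border sharpness, "
    "and distribution pattern. Based on these visual features, "
    "what is the specific skin disease?"
)


def build_chat_request(image_bytes: bytes, group: str,
                       model: str = "E27085921/HIKARI-Altair-8B-SkinDx") -> dict:
    """Build an OpenAI-style chat payload with the image embedded as a data URL."""
    data_url = "data:image/jpeg;base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{"role": "user", "content": [
            {"type": "image_url", "image_url": {"url": data_url}},
            {"type": "text", "text": PROMPT.format(group=group)},
        ]}],
        "max_tokens": 64,
        "temperature": 0.0,
    }


def send(payload: dict, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to the assumed local server and return the answer text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"].strip()
```

A typical call would be `send(build_chat_request(open("skin_lesion.jpg", "rb").read(), "inflammatory"))`, with the group taken from Stage 1 as in the transformers example.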
🔌 LoRA Adapter Version
from peft import PeftModel
from transformers import Qwen3VLForConditionalGeneration
import torch
base = Qwen3VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-8B-Thinking", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "E27085921/HIKARI-Altair-8B-SkinDx-LoRA")
→ E27085921/HIKARI-Altair-8B-SkinDx-LoRA
📄 Citation
@misc{hikari2026,
title = {HIKARI: RAG-in-Training for Skin Disease Diagnosis
with Cascaded Vision-Language Models},
author = {Watin Promfiy and Pawitra Boonprasart},
year = {2026},
institution = {King Mongkut's Institute of Technology Ladkrabang,
Department of Information Technology, Bangkok, Thailand}
}
Made with ❤️ at King Mongkut's Institute of Technology Ladkrabang (KMITL)