---
base_model:
- MAGAer13/mplug-owl2-llama2-7b
language:
- en
license: mit
pipeline_tag: image-to-text
library_name: transformers
---

# DeQA-Score-Mix3

DeQA-Score ([project page](https://depictqa.github.io/deqa-score/) / [code](https://github.com/zhiyuanyou/DeQA-Score) / [paper](https://arxiv.org/abs/2501.11561)) model weights, fully fine-tuned on the KonIQ, SPAQ, and KADID datasets.

This work is part of our [DepictQA project](https://depictqa.github.io/).

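## Quick Start with AutoModel

To score the example image ([singapore_flyer.jpg](https://raw.githubusercontent.com/zhiyuanyou/DeQA-Score/main/fig/singapore_flyer.jpg)), start an AutoModel scorer with `transformers==4.36.1`:
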
```python
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# trust_remote_code=True lets transformers run the model code shipped with the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "zhiyuanyou/DeQA-Score-Mix3",
    trust_remote_code=True,
    attn_implementation="eager",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Download the example image.
image_url = "https://raw.githubusercontent.com/zhiyuanyou/DeQA-Score/main/fig/singapore_flyer.jpg"
image = Image.open(requests.get(image_url, stream=True).raw)

# The input is a list of PIL images (here, a single image).
score = model.score([image])
```

The resulting `score` should be 1.9404 (range [1, 5]; higher is better).
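
Because `model.score` takes a list of images, larger image collections can be scored in chunks to keep memory use bounded. The helper below is a minimal sketch, not part of the DeQA-Score code: `score_in_batches` and `batch_size` are hypothetical names, and it assumes the scorer returns one score per input image, in input order, as values that `float()` can convert.

```python
from typing import Callable, List, Sequence


def score_in_batches(
    score_fn: Callable[[list], Sequence],
    images: Sequence,
    batch_size: int = 8,
) -> List[float]:
    """Score `images` in fixed-size chunks and return plain Python floats.

    Assumption: `score_fn` behaves like `model.score`, i.e. it takes a list
    of PIL images and returns one score per image, in the same order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    scores: List[float] = []
    for start in range(0, len(images), batch_size):
        batch = list(images[start:start + batch_size])
        # float() handles Python numbers and 0-d torch tensors alike.
        scores.extend(float(s) for s in score_fn(batch))
    return scores
```

With the model loaded as above, a call would look like `score_in_batches(model.score, [Image.open(p) for p in paths], batch_size=4)`, where `paths` is a hypothetical list of local image files.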

## Non-reference IQA Results (PLCC / SRCC)

| Dataset | KonIQ | SPAQ | KADID | PIPAL | LIVE-Wild | AGIQA | TID2013 | CSIQ |
|--------------|-----------|----------|----------|----------|-----------|----------|----------|----------|
| Q-Align (Baseline) | 0.945 / 0.938 | 0.933 / 0.931 | 0.935 / 0.934 | 0.409 / 0.420 | 0.887 / 0.883 | 0.788 / 0.733 | 0.829 / 0.808 | 0.876 / 0.845 |
| DeQA-Score (Ours) | **0.956 / 0.943** | **0.938 / 0.934** | **0.955 / 0.953** | **0.495 / 0.496** | **0.900 / 0.887** | **0.808 / 0.745** | **0.852 / 0.820** | **0.900 / 0.857** |
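
For readers unfamiliar with the two metrics: PLCC is the Pearson linear correlation between predicted scores and ground-truth mean opinion scores (MOS), and SRCC is the Spearman correlation, i.e. Pearson correlation computed on ranks. The sketch below illustrates both in plain Python on made-up numbers. It is not the authors' evaluation script, and it does not apply any nonlinear score mapping before PLCC, which some IQA protocols do; whether that step was used for the table above is not stated here.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient (PLCC)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation coefficient (SRCC)."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up predictions and MOS values, for illustration only.
predicted = [1.9, 3.2, 4.1, 2.5, 4.8]
mos = [2.1, 3.0, 4.5, 2.2, 4.6]
print(f"PLCC = {pearson(predicted, mos):.3f}, SRCC = {spearman(predicted, mos):.3f}")
```

In practice, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` compute the same two quantities.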

If you find our work useful for your research or applications, please cite it using the following BibTeX:

```bibtex
@inproceedings{deqa_score,
  title={Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution},
  author={You, Zhiyuan and Cai, Xin and Gu, Jinjin and Xue, Tianfan and Dong, Chao},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  year={2025},
}
```