	---
	base_model:
	- MAGAer13/mplug-owl2-llama2-7b
	language:
	- en
	license: mit
	pipeline_tag: image-to-text
	library_name: transformers
	---

	# DeQA-Score-Mix3

	DeQA-Score (
	[project page](https://depictqa.github.io/deqa-score/) /
	[code](https://github.com/zhiyuanyou/DeQA-Score) /
	[paper](https://arxiv.org/abs/2501.11561)
	) model weights, fully fine-tuned on the KonIQ, SPAQ, and KADID datasets.

	This work is part of our [DepictQA project](https://depictqa.github.io/).

	## Quick Start with AutoModel

	The following example uses `transformers==4.36.1` to load the model as an AutoModel scorer and score this image:

	![](https://raw.githubusercontent.com/zhiyuanyou/DeQA-Score/main/fig/singapore_flyer.jpg)

	```python
	import requests
	import torch
	from PIL import Image
	from transformers import AutoModelForCausalLM

	model = AutoModelForCausalLM.from_pretrained(
	    "zhiyuanyou/DeQA-Score-Mix3",
	    trust_remote_code=True,
	    attn_implementation="eager",
	    torch_dtype=torch.float16,
	    device_map="auto",
	)

	# Download the example image
	url = "https://raw.githubusercontent.com/zhiyuanyou/DeQA-Score/main/fig/singapore_flyer.jpg"
	image = Image.open(requests.get(url, stream=True).raw)

	# The input should be a list of PIL images (here, a list with one image)
	score = model.score([image])
	```

	The resulting `score` should be 1.9404. Scores lie in the range [1, 5]; higher is better.


	## Non-reference IQA Results (PLCC / SRCC)

	| Method | KonIQ | SPAQ | KADID | PIPAL | LIVE-Wild | AGIQA | TID2013 | CSIQ |
	|--------------|-----------|----------|----------|----------|-----------|----------|----------|----------|
	| Q-Align (Baseline) | 0.945 / 0.938 | 0.933 / 0.931 | 0.935 / 0.934 | 0.409 / 0.420 | 0.887 / 0.883 | 0.788 / 0.733 | 0.829 / 0.808 | 0.876 / 0.845 |
	| DeQA-Score (Ours) | 0.956 / 0.943 | 0.938 / 0.934 | 0.955 / 0.953 | 0.495 / 0.496 | 0.900 / 0.887 | 0.808 / 0.745 | 0.852 / 0.820 | 0.900 / 0.857 |
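
Each cell above reports PLCC (Pearson linear correlation coefficient) and SRCC (Spearman rank correlation coefficient) between predicted scores and human ratings. The sketch below shows how these two metrics can be computed in plain Python. It is an illustration only, not the evaluation code behind the table: the function names and the sample numbers are hypothetical, and it applies no nonlinear score mapping before computing PLCC, which some IQA evaluation protocols do.

```python
import math


def pearson(x, y):
    """Pearson linear correlation (PLCC) of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical numbers for illustration only, not taken from any dataset.
predicted = [1.9, 3.2, 4.1, 2.7, 4.8]
mos = [2.1, 3.0, 4.4, 2.2, 4.6]

plcc = pearson(predicted, mos)
srcc = spearman(predicted, mos)
print(f"PLCC = {plcc:.3f}, SRCC = {srcc:.3f}")
```

In this toy example the predictions order the images exactly as the ratings do, so SRCC is 1 even though the values differ; PLCC additionally penalizes departures from a linear relationship.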


	If you find our work useful for your research or applications, please cite it using the following BibTeX entry:

	```bibtex
	@inproceedings{deqa_score,
	  title={Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution},
	  author={You, Zhiyuan and Cai, Xin and Gu, Jinjin and Xue, Tianfan and Dong, Chao},
	  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
	  year={2025},
	}
	```