REVERSE achieves **state-of-the-art hallucination reduction** across diverse captioning and open-ended visual question answering benchmarks. To ensure an apples-to-apples comparison, we fine-tune the released Qwen2.5-VL-3B model with both the LLaVA-FT setup and our REVERSE recipe, applying both to the same 100k subset. Since Qwen2.5-VL's instruction-tuning data is not publicly available, this lets us directly compare the impact of our method against the LLaVA-FT baseline under consistent conditions.

| Benchmark    | Metric     | Qwen2.5-VL-FT | REVERSE (τ=0.01) |
| ------------ | ---------- | ------------- | ---------------- |
| CHAIR-MSCOCO | CHAIRi (↓) | 12.2          | **10.5**         |
|              | CHAIRs (↓) | 45.8          | **39.4**         |
| AMBER-G      | CHAIR (↓)  | 7.7           | **7.5**          |
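For intuition, the CHAIR scores reported above can be sketched roughly as follows. This is a minimal illustration, not the official evaluation script: it assumes object mentions have already been extracted per caption, and it skips the MSCOCO synonym-list normalization that the real benchmark applies. CHAIRi is the fraction of mentioned object instances that are not in the image; CHAIRs is the fraction of captions containing at least one such hallucinated object.

```python
def chair_metrics(mentioned_objects, gt_objects):
    """Compute (CHAIRi, CHAIRs) hallucination rates.

    mentioned_objects: per-caption list of object words extracted from the caption
    gt_objects: per-caption set of objects actually present in the image
    """
    total_mentions = 0         # all object mentions across captions
    hallucinated_mentions = 0  # mentions of objects absent from the image
    hallucinated_captions = 0  # captions with >= 1 hallucinated object
    for mentions, gt in zip(mentioned_objects, gt_objects):
        total_mentions += len(mentions)
        bad = [m for m in mentions if m not in gt]
        hallucinated_mentions += len(bad)
        if bad:
            hallucinated_captions += 1
    chair_i = hallucinated_mentions / max(total_mentions, 1)
    chair_s = hallucinated_captions / max(len(mentioned_objects), 1)
    return chair_i, chair_s


# Two captions: the first mentions a "frisbee" not present in the image.
ci, cs = chair_metrics(
    mentioned_objects=[["dog", "frisbee"], ["cat"]],
    gt_objects=[{"dog"}, {"cat"}],
)
# ci = 1/3 (one of three mentions hallucinated), cs = 0.5 (one of two captions)
```

Lower is better for both metrics, which is why the (↓) annotation appears in the table.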