Update README.md
README.md
CHANGED
@@ -2,6 +2,9 @@
 license: apache-2.0
 ---
 
+
+
+
 # VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations for Synthetic Videos
 
 [Zongxia Li*](https://zli12321.github.io/), [Xiyang Wu*](https://wuxiyang1996.github.io/), [Yubin Qin](https://www.linkedin.com/in/yubin-qin/), [Guangyao Shi](https://guangyaoshi.github.io/), [Hongyang Du](https://www.linkedin.com/in/hongyangdu/), [Dinesh Manocha](https://www.cs.umd.edu/people/dmanocha), [Tianyi Zhou](https://tianyizhou.github.io/), [Jordan Lee Boyd-Graber](https://users.umiacs.umd.edu/~ying/)
@@ -9,6 +12,10 @@ license: apache-2.0
 [[📄 Paper](https://arxiv.org/abs/2505.01481)] [[🤗 Dataset](https://huggingface.co/datasets/IntelligenceLab/VideoHallu)] [[🌐 Website](https://wuxiyang1996.github.io/videohallu_page/)]
 
 
+# Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation
+[[📄 Paper](https://arxiv.org/abs/2506.15068)]
+
+
 
 ## About VideoHallu
 
@@ -25,7 +32,8 @@ We also use GRPO to train [Qwen-2.5-VL-7B](https://huggingface.co/Qwen/Qwen2.5-V
 
 
 ## <a name='rb'></a>Reward Model
-
+- RewardBert is designed for free-form GRPO training, where answers cannot be evaluated by a simple correctness check.
+- We finetune [ModernBERT](https://huggingface.co/docs/transformers/en/model_doc/modernbert) on [MOCHA](https://arxiv.org/abs/2010.03636), [Prometheus-preference](https://huggingface.co/datasets/prometheus-eval/Preference-Collection), and [Pedants](https://arxiv.org/abs/2402.11161) to evaluate free-form text generations, and we use RewardBert as the reward in GRPO finetuning.
 
 #### Method: `compute_score`
 **Parameters**
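The hunk above ends before the parameter list, so the exact signature of `compute_score` is not visible in this diff. As a rough illustration of how such a semantic reward is typically called, here is a minimal sketch; the `rewardbert` import path, the default constructor, and the `(reference, candidate)` argument order are assumptions, not taken from the repository.

```python
# Minimal sketch only -- the module name, constructor, and compute_score
# signature are assumptions; check the repository's README for the real API.
from rewardbert import RewardBert  # hypothetical import path

scorer = RewardBert()  # hypothetical default construction

reference = "The glass shatters and the water spills onto the table."
candidate = "The glass breaks apart and water splashes over the table."

# A semantic reward model should return a graded score rather than a binary
# exact-match verdict, so close paraphrases of the reference still score well.
score = scorer.compute_score(reference, candidate)
print(score)
```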
@@ -75,6 +83,16 @@ If you find our work helpful for your research, please consider citing our work.
       url={https://arxiv.org/abs/2501.02189},
 }
 
+@misc{li2025semanticallyawarerewardsopenendedr1,
+      title={Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation},
+      author={Zongxia Li and Yapei Chang and Yuhang Zhou and Xiyang Wu and Zichao Liang and Yoo Yeon Sung and Jordan Lee Boyd-Graber},
+      year={2025},
+      eprint={2506.15068},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2506.15068},
+}
+
 @misc{guan2024hallusionbenchadvanceddiagnosticsuite,
       title={HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models},
       author={Tianrui Guan and Fuxiao Liu and Xiyang Wu and Ruiqi Xian and Zongxia Li and Xiaoyu Liu and Xijun Wang and Lichang Chen and Furong Huang and Yaser Yacoob and Dinesh Manocha and Tianyi Zhou},
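The Reward Model hunk above states that RewardBert supplies the reward signal during GRPO finetuning. Below is a minimal, illustrative sketch of how a per-completion reward function could wrap such a scorer for a GRPO-style trainer; the `rewardbert` module, the `compute_score` signature, and the reward-function interface are assumptions, not the repository's actual training code.

```python
from typing import List

from rewardbert import RewardBert  # hypothetical import path, as in the sketch above

scorer = RewardBert()  # hypothetical default construction

def rewardbert_reward(references: List[str], completions: List[str]) -> List[float]:
    """Return one scalar reward per sampled completion (illustrative only).

    GRPO compares several completions of the same prompt, so each completion is
    scored against the reference answer; higher means semantically closer.
    """
    return [
        float(scorer.compute_score(reference, completion))
        for reference, completion in zip(references, completions)
    ]

# Example: one question, three completions sampled from the policy.
refs = ["The ball falls and bounces twice before stopping."] * 3
outs = [
    "The ball drops, bounces two times, then comes to rest.",
    "The ball floats upward and never lands.",
    "It falls and bounces twice.",
]
print(rewardbert_reward(refs, outs))
```

Because the scores are graded rather than 0/1, a GRPO-style update can normalize them within each group of sampled answers, which is the point of using a semantic reward instead of exact match for free-form generation.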