harvey2333
/

omni_video_assistant_6_1

Text Generation

Transformers

PyTorch

omni


harvey2333 committed on Dec 10, 2023

Commit

9249cbd

1 Parent(s): a6fe732

Update README.md

Files changed (1)

README.md +2 -64

@@ -4,71 +4,9 @@ license: apache-2.0
 # Omni-VideoAssistant
 This is a Video Question Answering Large Language model.
-[code base](https://github.com/wanghao-cst/Omni-VideoAssistant).
 ## 📝 Updates
 * **[2023.12.09]**  🤗[Hugging Face](https://huggingface.co/harvey2333/omni_video_assistant_6_1) **A Better Model V6.1** are available now! Welcome to **watch** this repository for the latest updates.
 * **[2023.12.06]**  Gradio & CLI **Inference Demo** are available now.
-* **[2023.12.01]**  🤗[Hugging Face](https://huggingface.co/harvey2333/omni_video_assistant_5_3) **Preview Model** are available now!
-<details open><summary>💡 I also have other video-language projects that may interest you ✨. </summary><p>
-<!--  may -->
-> [**OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation**](https://arxiv.org/abs/2308.04126) <br>
-> Dongyang Yu, Shihao Wang, Yuan Fang, Wangpeng An <br>
-[![github](https://img.shields.io/badge/-Github-black?logo=github)](https://github.com/shajiayu1/MVCE/) [![arXiv](https://img.shields.io/badge/Arxiv-2310.01852-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2308.04126) <br></p></details>
-## 🔨 Preparation
-```bash
-git clone https://github.com/wanghao-cst/Omni-VideoAssistant
-cd Omni-VideoAssistant
-```
-```shell
-conda create -n omni python=3.10 -y
-conda activate omni
-pip install --upgrade pip
-pip install -e .
-```
-## 🌟 Start here
-### Download Omni Preview Model
-Download for CLI inference only, gradio web UI will download it automatically.
-[Omni Preview Model 6.1](https://huggingface.co/harvey2333/omni_video_assistant_6_1)
-### Inference in Gradio Web UI
-```Shell
-CUDA_VISIBLE_DEVICES=0 python -m  llava.serve.gradio_demo
-```
-<p align="left">
-<img src="assets/gradio_demo.png" width=100%>
-</p>
-### Inference in CLI
-```
-CUDA_VISIBLE_DEVICES=0 python -m llava.eval.run_omni \
-    --model-path "path to omni checkpoints" \
-    --image-file "llava/serve/examples/extreme_ironing.jpg" \
-    --query "What is unusual about this image?"
-CUDA_VISIBLE_DEVICES=0 python -m llava.eval.run_omni \
-    --model-path "path to omni checkpoints" \
-    --video-file "llava/serve/examples/0A8CF.mp4" \
-    --query "Describe the activity in the video"
-```
-## 🔥 Results Comparision (based on model 5.3, evaluation on 6.1 is doing)
-### Image understanding
-<p align="left">
-<img src="assets/val_img.png" width=100%>
-</p>
-### Video understanding
-<p align="left">
-<img src="assets/val_vid.png" width=100%>
-</p>
-## 😊 Acknowledgment
-This work is based on [MVCE for unlimited training data generation.](https://github.com/shajiayu1/MVCE/), [LLaVA for pretrained model](https://github.com/haotian-liu/LLaVA/)

 # Omni-VideoAssistant
 This is a Video Question Answering Large Language model.
+[code base is here for more details:](https://github.com/wanghao-cst/Omni-VideoAssistant).
 ## 📝 Updates
 * **[2023.12.09]**  🤗[Hugging Face](https://huggingface.co/harvey2333/omni_video_assistant_6_1) **A Better Model V6.1** are available now! Welcome to **watch** this repository for the latest updates.
 * **[2023.12.06]**  Gradio & CLI **Inference Demo** are available now.
+* **[2023.12.01]**  🤗[Hugging Face](https://huggingface.co/harvey2333/omni_video_assistant_5_3) **Preview Model** are available now!