I have produced a quantized exllamav2 version: 6 GB and much faster inference

#3
by sujitvasanth - opened

Hi. I was able to run your 7B Image-Text-to-Text model on exllamav2, which now supports qwen2, kimi, and the vision tower.
I had to build a custom architecture configuration for exllamav2 to support your model.
It allows for much faster inference with much lower VRAM use.
https://huggingface.co/sujitvasanth/OpenCUA-7B-exl2
I'm just uploading it now.
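For anyone who wants to try it, loading the quant follows the usual exllamav2 dynamic-generator pattern. The sketch below is only a rough outline, not the repo's actual inference script: the vision-tower calls (`ExLlamaV2VisionTower`, `get_image_embeddings`, `text_alias`) follow my memory of the upstream multimodal example and may differ by exllamav2 version, and the plain-text prompt is just a placeholder rather than the model's real chat template.

```python
from PIL import Image
from exllamav2 import (
    ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache,
    ExLlamaV2Tokenizer, ExLlamaV2VisionTower,
)
from exllamav2.generator import ExLlamaV2DynamicGenerator

model_dir = "OpenCUA-7B-exl2"                 # local clone of the quantized repo
config = ExLlamaV2Config(model_dir)
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache, progress=True)    # split across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

# vision tower (per the exllamav2 multimodal example; names may vary by version)
vision = ExLlamaV2VisionTower(config)
vision.load(progress=True)
image = Image.open("screenshot.png")
emb = vision.get_image_embeddings(model=model, tokenizer=tokenizer, image=image)

generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)
output = generator.generate(
    prompt=emb.text_alias + "\nDescribe the current screen.",  # placeholder prompt, not the real chat template
    max_new_tokens=256,
    add_bos=True,
    embeddings=[emb],
)
print(output)
```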

XLang NLP Lab org

Wow! That's cool! I can help add the link to the OpenCUA-7B readme, if you are OK with it.

Best,
Xinyuan

A problem... on deeper testing, the quantised model is showing inconsistent visual understanding... I will need to look at the exllama custom model structure.

Still struggling with a proper implementation on exllamav3... do you have a version of 7B that uses the standard qwen2.5vl architecture?
It would also help to know how the 2D RoPE is transformed to 1D... my model is seeing the image in out-of-sync patches, as far as I can tell from how they are ordered.
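To show what I mean by out-of-sync patches, here is a toy numpy illustration (not OpenCUA's actual layout, just the kind of mismatch I suspect): if one side flattens the 2D patch grid row-major and the other side column-major, or applies a 2x2 spatial-merge grouping (as I understand qwen2.5vl does) before flattening, the same 1D positions end up pointing at different patches.

```python
import numpy as np

# 4x6 grid of patch indices, purely illustrative
rows, cols = 4, 6
grid = np.arange(rows * cols).reshape(rows, cols)

row_major = grid.reshape(-1)       # plain row-major scan of the patch grid
col_major = grid.T.reshape(-1)     # column-major scan -> patches interleaved "out of sync"

# 2x2 spatial-merge style grouping: neighbouring patches become contiguous,
# which reorders them again relative to a plain row-major scan
m = 2
merged = grid.reshape(rows // m, m, cols // m, m).transpose(0, 2, 1, 3).reshape(-1)

print(row_major[:8])   # [0 1 2 3 4 5 6 7]
print(col_major[:8])   # [ 0  6 12 18  1  7 13 19]
print(merged[:8])      # [0 1 6 7 2 3 8 9]
```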

Hi @Xinyuan, yes, it's all working now - I had to adjust the Python inference script to get the image embeds aligned properly.
The working repo (same weights, updated inference script) is available at https://huggingface.co/sujitvasanth/OpenCUA-7B-exl2
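For anyone hitting the same issue, the essence of the fix was making sure every image-placeholder position in the token sequence receives the matching vision embedding, in the order the tower emits the patches. A toy torch sketch of that invariant (dummy ids and tensors only, not the actual script or OpenCUA's real special tokens):

```python
import torch

hidden, n_patches = 8, 4
image_token_id = 151655                        # stand-in placeholder id, not OpenCUA's real one
input_ids = torch.tensor([[1, 151655, 151655, 151655, 151655, 2]])
inputs_embeds = torch.zeros(1, input_ids.shape[1], hidden)   # would come from the embedding layer
image_embeds = torch.randn(n_patches, hidden)                # would come from the vision tower

mask = input_ids == image_token_id
assert mask.sum().item() == n_patches, "placeholder count must match patch count"
inputs_embeds[mask] = image_embeds             # misordered or offset patches here -> scrambled image
```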
regarding "I can help add the link to the OpenCUA-7B readme" - yes please do this.

I have also developed a much lower-resource pipeline for deploying on local computers (Ubuntu, Windows); it uses VNC or RDP to self-host the environment without the need for a full virtual machine.
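The core loop is simply: grab a frame over VNC, ask the model where to act, and send the input back. A bare-bones sketch of that idea using vncdotool (illustrative only; `predict_click` is a hypothetical stand-in for the actual OpenCUA inference call, and the host/password are examples):

```python
from vncdotool import api

def predict_click(image_path: str) -> tuple[int, int]:
    # hypothetical stand-in: run OpenCUA on the screenshot and return target x, y
    return 100, 100

client = api.connect("127.0.0.1::5900", password="secret")   # VNC server exposing the desktop
client.captureScreen("frame.png")    # capture the current screen to a file
x, y = predict_click("frame.png")    # model decides where to click
client.mouseMove(x, y)
client.mousePress(1)                 # button 1 = left click
api.shutdown()
```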


XLang NLP Lab org

Thank you! Great to hear it’s working now!

I have added the link to the OpenCUA GitHub repo and the OpenCUA-7B README.

Also very nice idea on the lightweight local deployment — that could be quite useful for community users.

Best,
Xinyuan

xywang626 changed discussion status to closed
