Model Card for nnethercott/llava-v1.5-7b-gpt4OCR-hf-AWQ

Model Details

Model Description

AWQ-quantized version of nnethercott/llava-v1.5-7b-gpt4OCR-hf. The AutoAWQ quantization config is included in the repository files.

The two datasets used for fine-tuning are:

We use 10k samples from GRIT, keeping only samples whose image-caption CLIP similarity is larger than 0.35 and whose caption contains no proper nouns (filtered using spaCy).
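The filtering rule described above can be sketched roughly as follows. This is an illustrative sketch, not the author's actual preprocessing script: the `Sample` class, the `keep_sample` and `filter_samples` helpers, and the assumption that CLIP similarities and part-of-speech tags were precomputed are all hypothetical. The only values taken from the card are the 0.35 threshold and the 10k sample count. `"PROPN"` is the universal POS tag that spaCy exposes as `token.pos_` for proper nouns.

```python
from dataclasses import dataclass
from typing import Iterable, List

CLIP_THRESHOLD = 0.35  # threshold stated in the model card
MAX_SAMPLES = 10_000   # "10k samples" stated in the model card


@dataclass
class Sample:
    """Hypothetical container for one GRIT image-caption pair."""
    caption: str
    clip_similarity: float  # assumed precomputed image-caption CLIP similarity
    pos_tags: List[str]     # assumed per-token POS tags, e.g. [t.pos_ for t in nlp(caption)]


def keep_sample(sample: Sample, threshold: float = CLIP_THRESHOLD) -> bool:
    """Keep a sample if it is well aligned and its caption has no proper nouns."""
    has_proper_noun = "PROPN" in sample.pos_tags
    return sample.clip_similarity > threshold and not has_proper_noun


def filter_samples(samples: Iterable[Sample], limit: int = MAX_SAMPLES) -> List[Sample]:
    """Collect up to `limit` samples that pass the filter, in input order."""
    kept: List[Sample] = []
    for sample in samples:
        if keep_sample(sample):
            kept.append(sample)
            if len(kept) == limit:
                break
    return kept
```

With spaCy installed, the `pos_tags` field could be filled from a loaded pipeline (for example `[t.pos_ for t in nlp(caption)]`); which spaCy model was used for the original filtering is not stated in the card.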

How to Get Started with the Model

Use the code below to get started with the model:

import time

import requests
import torch
from PIL import Image
from transformers import AutoProcessor
from awq import AutoAWQForCausalLM

# Hub ID of this quantized checkpoint; a local path to the downloaded files also works.
awq_model_id = "nnethercott/llava-v1.5-7b-gpt4OCR-hf-AWQ"
processor = AutoProcessor.from_pretrained(awq_model_id)
model = AutoAWQForCausalLM.from_quantized(awq_model_id, safetensors=True, device_map={"": 0}, fuse_layers=False)

image_file = "https://adquick-public.imgix.net/landing+images/media_formats/billboard-carvana.png?auto=format"
prompt = "USER:<image>\ngenerate a descriptive caption for this image. ASSISTANT: "
image = Image.open(requests.get(image_file, stream=True).raw).convert("RGB")

# Assumed generation settings; the original card does not specify them.
generation_kwargs = {"max_new_tokens": 128, "do_sample": False}

with torch.no_grad():
    # Move the inputs to GPU 0 and cast the floating-point tensors to float16.
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(0, torch.float16)

    start = time.perf_counter()
    out = model.generate(
        **inputs,
        **generation_kwargs,
    )
    stop = time.perf_counter()

    # Drop the prompt tokens so only the newly generated text is decoded and timed.
    new_tokens = out[:, inputs["input_ids"].shape[1]:]
    print(processor.tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0])
    print(f"generation speed: {round(new_tokens.shape[1] / (stop - start), 1)} [t/s]")
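When reporting generation speed, it can help to separate prompt tokens from newly generated ones, since `generate` returns both in a single sequence. The helper below is a small sketch of that arithmetic; the function name `tokens_per_second` and its parameters are hypothetical and not part of any library.

```python
def tokens_per_second(total_len: int, prompt_len: int, elapsed_s: float) -> float:
    """Throughput over newly generated tokens only.

    total_len:  length of the sequence returned by generate (prompt + new tokens)
    prompt_len: number of prompt tokens passed in as input_ids
    elapsed_s:  wall-clock generation time in seconds
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    new_tokens = max(total_len - prompt_len, 0)
    return new_tokens / elapsed_s
```

For example, a 150-token output that includes a 50-token prompt and took 4 seconds corresponds to 25 generated tokens per second, whereas dividing the full length by the time would report 37.5.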

Output for nnethercott/llava-v1.5-7b-gpt4OCR-hf-AWQ:

The image captures a Carvana billboard under a clear blue sky, showcasing a red sports car being towed by a white Carvana truck. The billboard prominently features the Carvana logo and the slogan "Buy your next car from your couch.

Safetensors · Model size: 1.45B params · Tensor types: I32, F16