It looks like there is an incorrect limit on the model context length. The fp16 model, like the original one, has a context length of 131072. Updating this value resolved errors when processing longer prompts.
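The change described above amounts to patching `max_position_embeddings` in the exported `config.json`. A minimal sketch (the helper name and file layout are illustrative, not part of this repo):

```python
import json
from pathlib import Path

def patch_context_length(config_path: str, new_length: int = 131072) -> dict:
    """Set max_position_embeddings in an exported config.json.

    131072 matches the context length of the original fp16 model.
    """
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["max_position_embeddings"] = new_length
    path.write_text(json.dumps(config, indent=2))
    return config
```

For example, running `patch_context_length("config.json")` on a config exported with a smaller limit rewrites the file in place with the full 131072-token context length.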
#2
opened by dtrawins
No description provided.
This is a known issue and a current limitation of the INT4 model. Once optimum-intel allows preserving the original max_position_embeddings, we will re-upload the model.
amokrov changed pull request status to closed