Trying to use the original safetensors model file, but Ollama only builds a 3.6GB one

#169
by FatalPuppet - opened

I had issues building the model for Ollama from the model-x-of-y.safetensors files (all the required ones were present in the same folder), so I tried to get the original 13.5GB file instead.
When I try to register/create it in Ollama (ollama create GPT-OSS-20B -f ./Modelfile), it starts correctly and creates the appropriate SHA entries, but stops far too early given the file size:

[Screenshot 2025-12-01 103340]
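For context, my Modelfile is the minimal safetensors-import form, roughly like this (the path below is a placeholder, not my actual path):

```
# FROM can point at a local directory containing the safetensors
# shards plus config.json and the tokenizer files
FROM /path/to/gpt-oss-20b
```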

I went to look at the model.safetensors.index.json file and noticed it was still indexing the original model-x-of-y.safetensors files, so I tried editing it to point to the single model.safetensors file I now have.
Again it created a reduced-size model:

[Screenshot 2025-12-01 103528]
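In case it helps anyone reproduce this, here is a rough sketch of the check I did on the index: it lists every shard file that weight_map references and flags any that are missing on disk (file names follow the standard Hugging Face sharded layout; nothing here is Ollama-specific):

```python
import json
import os

def check_safetensors_index(model_dir):
    """Return (referenced_shards, missing_shards) for an HF-style index."""
    index_path = os.path.join(model_dir, "model.safetensors.index.json")
    with open(index_path) as f:
        index = json.load(f)
    # weight_map maps each tensor name to the shard file that stores it
    shards = sorted(set(index["weight_map"].values()))
    missing = [s for s in shards
               if not os.path.exists(os.path.join(model_dir, s))]
    return shards, missing
```

If the shards have been replaced with a single model.safetensors, every value in weight_map should point at that one file; I suspect entries that still name the old shards are what produces the truncated model.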

Should I look into the other *.json files to see if they are pointing to the correct file?
Is there a better way to load the safetensors correctly in Ollama?
I also tried converting it to .gguf, but the conversion fails consistently while installing the base requirements.
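For completeness, the conversion route I attempted was the usual llama.cpp one, roughly as follows (paths are placeholders):

```shell
# Standard llama.cpp safetensors -> GGUF conversion route
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt   # this is the step that fails for me
python convert_hf_to_gguf.py /path/to/gpt-oss-20b \
    --outfile gpt-oss-20b.gguf --outtype f16
```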

I'm running in a VM with 10 cores @ 2.2GHz (Xeon Platinum 8352Y) and 32GB RAM; no GPU available.

As an additional test, I tried loading just the bare safetensors, tokenizer, and config files in a folder pointed to by my Modelfile; the same ~3.5GB limit arose. I tried both the config files from the general repository and the "original" ones; the latter returned an "unknown architecture" error.
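Regarding the "unknown architecture" error: a quick way to compare the two sets of config files is to read the architectures field each config.json declares, since that is what importers typically key on to pick a converter (a minimal sketch, assuming standard HF config.json files):

```python
import json

def declared_architectures(config_path):
    """Return the architectures list an HF config.json declares."""
    with open(config_path) as f:
        cfg = json.load(f)
    # Importers generally match on this field; an unrecognized value
    # is the typical cause of an "unknown architecture" error
    return cfg.get("architectures", [])
```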
