Trying to use the original safetensors model file, but Ollama only builds a 3.6GB one

#169
by FatalPuppet - opened

I had issues building the model for Ollama from the model-x-of-y.safetensors files (all the required ones were present in the same folder), so I tried to get the original 13.5GB file instead.
When I try to register/create it in Ollama (ollama create GPT-OSS-20B -f ./Modelfile), it starts correctly and creates the appropriate SHA entries, but stops far too early given the file size:

[Screenshot 2025-12-01 103340]
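For context, my Modelfile is the minimal safetensors-import form, roughly like this (the path below is a placeholder, not my actual path):

```
# FROM can point at a local directory containing the safetensors
# shards plus config.json and the tokenizer files
FROM /path/to/gpt-oss-20b
```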

I went to look at the model.safetensors.index.json file and noticed it was still indexing the original model-x-of-y.safetensors files, so I tried editing it to point to the single model.safetensors file I now have.
Again it created a reduced-size model:

[Screenshot 2025-12-01 103528]
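In case it helps anyone reproduce this, here is a rough sketch of the check I did on the index: it lists every shard file that weight_map references and flags any that are missing on disk (file names follow the standard Hugging Face sharded layout; nothing here is Ollama-specific):

```python
import json
import os

def check_safetensors_index(model_dir):
    """Return (referenced_shards, missing_shards) for an HF-style index."""
    index_path = os.path.join(model_dir, "model.safetensors.index.json")
    with open(index_path) as f:
        index = json.load(f)
    # weight_map maps each tensor name to the shard file that stores it
    shards = sorted(set(index["weight_map"].values()))
    missing = [s for s in shards
               if not os.path.exists(os.path.join(model_dir, s))]
    return shards, missing
```

If the shards have been replaced with a single model.safetensors, every value in weight_map should point at that one file; I suspect entries that still name the old shards are what produces the truncated model.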

Should I look into the other *.json files to see if they are pointing to the correct file?
Is there a better way to load the safetensors correctly in Ollama?
I also tried converting it to .gguf, but the conversion fails consistently while installing the base requirements.
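For completeness, the conversion route I attempted was the usual llama.cpp one, roughly as follows (paths are placeholders):

```shell
# Standard llama.cpp safetensors -> GGUF conversion route
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt   # this is the step that fails for me
python convert_hf_to_gguf.py /path/to/gpt-oss-20b \
    --outfile gpt-oss-20b.gguf --outtype f16
```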

I'm running in a VM with 10 cores @ 2.2GHz (Xeon Platinum 8352Y) and 32GB RAM; no GPU available.

As an additional test, I tried loading just the bare safetensors, tokenizer, and config files in a folder pointed to by my Modelfile; the same ~3.5GB limit arose. I tried both the config files from the general repository and the "original" ones; the latter returned an "unknown architecture" error.
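Regarding the "unknown architecture" error: a quick way to compare the two sets of config files is to read the architectures field each config.json declares, since that is what importers typically key on to pick a converter (a minimal sketch, assuming standard HF config.json files):

```python
import json

def declared_architectures(config_path):
    """Return the architectures list an HF config.json declares."""
    with open(config_path) as f:
        cfg = json.load(f)
    # Importers generally match on this field; an unrecognized value
    # is the typical cause of an "unknown architecture" error
    return cfg.get("architectures", [])
```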
