Has anyone ever gotten this to work?

#178
by cob05 - opened

Do I just have bad luck? I've tried a bunch of repos (most recently THUDM/SWE-Dev-9B) and have always had it error out at some point.

Well, I reported exactly where the error happens and also noted that it used to work:
https://huggingface.co/spaces/ggml-org/gguf-my-repo/discussions/158

But people keep opening new discussions or posting new comments instead of upvoting it, so this place has become a mess. You, for example, don't even say what your error is, so I have to guess it's the same one I already reported.

I guess the project is abandoned if it hasn't been fixed by now.

For those who need features like local Windows support, lower-bit IQ quants, and a download-before-upload workflow, I've created an enhanced fork of this script.

You can find it here: https://huggingface.co/spaces/Fentible/gguf-repo-suite

Clone the repo to your own HF Space or locally using the Quick Start guides.

I could not get it to work on free HF Spaces, but it might be possible with a paid Space. I tested on Windows 10 and made some quants for Gemma 3 abliterated by mlabonne.

The bug: ggml-rpc.dll is very finicky, and you may need to compile your own build of llama-imatrix to work around it.

Offline mode is needed for 27B+ models.

Worked fine for me, I now have a Q8_0 copy of Pixtral 12B Lumimaid.

From https://huggingface.co/mrcuddle/Lumimaid-v0.2-12B-Pixtral to https://huggingface.co/Koitenshin/Lumimaid-v0.2-12B-Pixtral-Q8_0-GGUF

Did every quant option available using this space in just a couple of minutes. Now available at https://huggingface.co/Koitenshin/Lumimaid_VISION-v0.2-12B-Pixtral-GGUF

No mucking about with setting up my own environments, compiling llama.cpp, etc.

Another attempt, another failure...

Error converting to fp16: INFO:hf-to-gguf:Loading model: granite-vision-3.3-2b-embedding
WARNING:hf-to-gguf:Failed to load model config from downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding: The repository downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding .
 You can inspect the repository content at https://hf.co/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: GraniteForCausalLM
WARNING:hf-to-gguf:Failed to load model config from downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding: The repository downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding .
 You can inspect the repository content at https://hf.co/downloads/tmpp4dy37n8/granite-vision-3.3-2b-embedding.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00003.safetensors'
Traceback (most recent call last):
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8595, in <module>
    main()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8589, in main
    model_instance.write()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 410, in write
    self.prepare_tensors()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 2126, in prepare_tensors
    super().prepare_tensors()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 277, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 2036, in modify_tensors
    n_head = self.hparams["num_attention_heads"]
             ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'num_attention_heads'
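The KeyError above suggests the embedding variant's config.json doesn't expose `num_attention_heads` at the top level; vision/embedding checkpoints often nest the text hyperparameters under a `text_config` block. A hypothetical defensive lookup (illustrative only, not the converter's actual code):

```python
def find_hparam(config: dict, key: str):
    """Look up a hyperparameter at the top level of config.json,
    falling back to a nested text_config block if present.
    Illustrative sketch of the fallback a converter would need
    for configs shaped like this model's."""
    if key in config:
        return config[key]
    text_cfg = config.get("text_config", {})
    if key in text_cfg:
        return text_cfg[key]
    raise KeyError(key)

# A flat config resolves directly; a nested one needs the fallback.
flat = {"num_attention_heads": 32}
nested = {"text_config": {"num_attention_heads": 20}}
print(find_hparam(flat, "num_attention_heads"))    # 32
print(find_hparam(nested, "num_attention_heads"))  # 20
```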

Tried again... Failed again.

Error converting to fp16: INFO:hf-to-gguf:Loading model: MiniCPM-V-4
WARNING:hf-to-gguf:Failed to load model config from downloads/tmp9t9m7d0a/MiniCPM-V-4: The repository downloads/tmp9t9m7d0a/MiniCPM-V-4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmp9t9m7d0a/MiniCPM-V-4 .
 You can inspect the repository content at https://hf.co/downloads/tmp9t9m7d0a/MiniCPM-V-4.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: MiniCPMV
ERROR:hf-to-gguf:Model MiniCPMV is not supported

Another try, and it fails once again... Never gotten it to work.

Error converting to fp16: INFO:hf-to-gguf:Loading model: Ovis2.5-9B
WARNING:hf-to-gguf:Failed to load model config from downloads/tmp6nkpckoz/Ovis2.5-9B: The repository downloads/tmp6nkpckoz/Ovis2.5-9B contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmp6nkpckoz/Ovis2.5-9B .
 You can inspect the repository content at https://hf.co/downloads/tmp6nkpckoz/Ovis2.5-9B.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: Qwen3ForCausalLM
WARNING:hf-to-gguf:Failed to load model config from downloads/tmp6nkpckoz/Ovis2.5-9B: The repository downloads/tmp6nkpckoz/Ovis2.5-9B contains custom code which must be executed to correctly load the model. You can inspect the repository content at /home/user/app/downloads/tmp6nkpckoz/Ovis2.5-9B .
 You can inspect the repository content at https://hf.co/downloads/tmp6nkpckoz/Ovis2.5-9B.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: loading model part 'model-00001-of-00004.safetensors'
Traceback (most recent call last):
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8788, in <module>
    main()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 8782, in main
    model_instance.write()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 425, in write
    self.prepare_tensors()
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 292, in prepare_tensors
    for new_name, data_torch in (self.modify_tensors(data_torch, name, bid)):
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 2923, in modify_tensors
    yield from super().modify_tensors(data_torch, name, bid)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 260, in modify_tensors
    return [(self.map_tensor_name(name), data_torch)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/app/./llama.cpp/convert_hf_to_gguf.py", line 251, in map_tensor_name
    raise ValueError(f"Can not map tensor {name!r}")
ValueError: Can not map tensor 'llm.lm_head.weight'
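This failure mode is different from the earlier two: the checkpoint stores its language-model weights under an `llm.` prefix (`llm.lm_head.weight`) that the converter's tensor map doesn't recognize, which is typical of multimodal wrappers. A hypothetical helper for spotting such prefixes before converting (the prefix list is an assumption for illustration):

```python
def wrapper_prefix(tensor_names):
    """Return a common wrapper prefix (e.g. 'llm.') if every tensor
    name starts with it, else None. Illustrative helper for spotting
    multimodal checkpoints whose text weights a text-only converter
    will refuse to map."""
    for prefix in ("llm.", "language_model.", "model.text_model."):
        if tensor_names and all(n.startswith(prefix) for n in tensor_names):
            return prefix
    return None

names = ["llm.lm_head.weight", "llm.model.embed_tokens.weight"]
print(wrapper_prefix(names))  # llm.
```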

@cob05

You're intentionally testing it on models that Llama.cpp doesn't support yet, of course it's not going to work.

It's interesting that those 'unsupported' models have GGUF quants available though. This space literally says pick a repo and it will convert it to GGUF. What am I missing? Maybe they need to specify which models work and which don't so I stop wasting my time.

cob05 changed discussion status to closed

Those quants are from people quantizing it on their own machines, most likely in a sandboxed environment due to the remote code.
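To avoid wasting runs on unsupported models, you can check the `architectures` field of a repo's config.json before submitting it. A minimal sketch; the supported set below is a small illustrative sample, not the real list, which varies by llama.cpp version (see the model classes in convert_hf_to_gguf.py):

```python
import json

# Illustrative subset only -- the authoritative list lives in
# llama.cpp's convert_hf_to_gguf.py and changes between releases.
SUPPORTED = {"LlamaForCausalLM", "Qwen2ForCausalLM", "GemmaForCausalLM"}

def is_convertible(config_json: str) -> bool:
    """Read the 'architectures' field from a model's config.json text
    and report whether any declared architecture is in the set."""
    config = json.loads(config_json)
    return any(a in SUPPORTED for a in config.get("architectures", []))

print(is_convertible('{"architectures": ["LlamaForCausalLM"]}'))  # True
print(is_convertible('{"architectures": ["MiniCPMV"]}'))          # False
```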