bartowski/tencent_Hunyuan-A13B-Instruct-GGUF

Text Generation

bartowski committed on Jul 8

Commit f73ed46 · verified · 1 Parent(s): d232187

Update README.md

Files changed (1)

README.md +2 -0

README.md CHANGED

@@ -12,6 +12,8 @@ license_name: tencent-hunyuan-a13b
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b5845">b5845</a> for quantization.
+Using additionally this fork/PR for extra MoE performance: https://github.com/ggml-org/llama.cpp/pull/12727
 Original model: https://huggingface.co/tencent/Hunyuan-A13B-Instruct
 All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)