llmware/llama-3.2-3b-instruct-onnx

doberst committed on Oct 26, 2024

Commit 1fd779b · verified · 1 Parent(s): 87b50de

Update README.md

Files changed (1)

README.md +5 -6

@@ -3,17 +3,16 @@ license: llama3.2
 inference: false
 tags:
 - green
-- p1
 - llmware-chat
 - ov
-- emerald
 ---
-# llama-3.2-1b-instruct-onnx
-**llama-3.2-1b-instruct-onnx** is an ONNX int4 quantized version of Llama 3.2 1B Instruct, providing a very small, very fast inference implementation, optimized for AI PCs using Intel GPU, CPU and NPU.
-[**llama-3.2-1b-instruct**](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) is a new 1B chat foundation model from Meta.
 ### Model Description
@@ -21,7 +20,7 @@ tags:
 - **Developed by:** meta-llama
 - **Quantized by:** llmware
 - **Model type:** llama-3.2
-- **Parameters:** 1 billion
 - **Model Parent:** meta-llama/Meta-Llama-3.2-1B-Instruct
 - **Language(s) (NLP):** English
 - **License:** Llama 3.2 Community License

 inference: false
 tags:
 - green
+- p3
 - llmware-chat
 - ov
 ---
+# llama-3.2-3b-instruct-onnx
+**llama-3.2-3b-instruct-onnx** is an ONNX int4 quantized version of Llama 3.2 3B Instruct, providing a very small, very fast inference implementation, optimized for AI PCs using Intel GPU, CPU and NPU.
+[**llama-3.2-3b-instruct**](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct) is a new 3B chat foundation model from Meta.
 ### Model Description
 - **Developed by:** meta-llama
 - **Quantized by:** llmware
 - **Model type:** llama-3.2
+- **Parameters:** 3 billion
 - **Model Parent:** meta-llama/Meta-Llama-3.2-1B-Instruct
 - **Language(s) (NLP):** English
 - **License:** Llama 3.2 Community License