Makatia committed · Commit 204d1a3 · verified · 1 Parent(s): 2618516

Upload README.md with huggingface_hub

Files changed (1): README.md (+50 −0)
README.md CHANGED
---
license: mit
library_name: onnxruntime
tags:
- tinyllama
- onnx
- quantized
- edge-llm
- raspberry-pi
- local-inference
model_creator: TinyLlama
language: en
---

# TinyLlama-1.1B-Chat-v1.0 (ONNX): Local LLM Model Repository

This repository contains quantized ONNX exports of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0), optimized for efficient local inference on resource-constrained devices such as the Raspberry Pi and other ARM-based single-board computers.

---

## 🟦 ONNX Model

**Included files:**

- `model.onnx`
- `model_quantized.onnx`
- `model.onnx.data` (if sharded)
- Configuration files (`config.json`, `tokenizer.json`, etc.)

**Recommended for:** ONNX Runtime, KleidiAI, and other compatible frameworks.
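
Files like `model_quantized.onnx` are commonly produced with ONNX Runtime's post-training dynamic quantization. The exact settings used for this repository are not documented, so the following is only an illustrative sketch:

```python
# Illustrative only: one common way to derive an INT8 model_quantized.onnx
# from the full-precision export. The actual settings used for this
# repository are not documented.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="model.onnx",
    model_output="model_quantized.onnx",
    weight_type=QuantType.QInt8,  # INT8 weights: smaller file, faster matmuls on ARM CPUs
)
```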

### Quick Start

```python
import onnxruntime as ort

# Load the full-precision export; use "model_quantized.onnx" for the INT8 variant.
session = ort.InferenceSession("model.onnx")
# ... inference code here ...
```
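
On a Raspberry Pi it often helps to pin the session to the CPU execution provider and set thread counts explicitly. A minimal sketch using standard `onnxruntime` session options; the thread count is an assumption for a 4-core board, not a documented recommendation for this model:

```python
import onnxruntime as ort

# Illustrative session tuning for a small ARM board (values are assumptions,
# not documented settings for this model).
opts = ort.SessionOptions()
opts.intra_op_num_threads = 4  # e.g., one thread per core on a Pi 4/5
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

session = ort.InferenceSession(
    "model_quantized.onnx",              # INT8 file from this repository
    sess_options=opts,
    providers=["CPUExecutionProvider"],  # pin to the CPU provider explicitly
)
print([i.name for i in session.get_inputs()])  # inspect expected input names
```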

The ONNX export enables efficient inference on CPUs, NPUs, and other accelerators, making it ideal for local and edge deployments.

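Since the export was produced with Optimum (see Credits), the simplest end-to-end path is `optimum.onnxruntime`'s `ORTModelForCausalLM`, which hides the session behind the familiar `generate()` API. A sketch under some assumptions: the repository is downloaded to a local directory (the path below is hypothetical), `file_name` selects which of the two ONNX files to load, and the tokenizer ships TinyLlama-Chat's chat template:

```python
# Sketch: end-to-end chat generation through Optimum's ONNX Runtime wrapper.
# "./tinyllama-onnx" is a hypothetical local copy of this repository.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_dir = "./tinyllama-onnx"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = ORTModelForCausalLM.from_pretrained(model_dir, file_name="model_quantized.onnx")

# Build a prompt with the model's chat template, then generate.
messages = [{"role": "user", "content": "Explain ONNX in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
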
---

## 📋 Credits

- **Base model:** [TinyLlama](https://huggingface.co/TinyLlama)
- **ONNX export:** [Optimum](https://github.com/huggingface/optimum), [ONNX Runtime](https://github.com/microsoft/onnxruntime)
- **Optimization:** ARM-optimized for Raspberry Pi

---

**Maintainer:** [Makatia](https://huggingface.co/Makatia)