---
license: mit
library_name: onnxruntime
tags:
  - tinyllama
  - onnx
  - quantized
  - edge-llm
  - raspberry-pi
  - local-inference
model_creator: TinyLlama
language: en
---

# TinyLlama-1.1B-Chat-v1.0 (ONNX): Local LLM Model Repository

This repository contains quantized ONNX exports of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0), optimized for efficient local inference on resource-constrained devices such as the Raspberry Pi and other ARM-based single-board computers.
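
To pull all of the files in one call, `huggingface_hub.snapshot_download` works well. A minimal sketch; the `repo_id` below is a placeholder, substitute this repository's actual id on the Hub:

```python
# Sketch for fetching the repository locally; the repo_id is a placeholder,
# replace it with this repository's actual id on the Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="Makatia/TinyLlama-1.1B-Chat-v1.0-ONNX")
print(local_dir)  # path to the downloaded files
```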


## 🟦 ONNX Model

Included files:

- `model.onnx`
- `model_quantized.onnx`
- `model.onnx.data` (present if the export uses sharded external data)
- Configuration files (`config.json`, `tokenizer.json`, etc.)

Recommended for: ONNX Runtime (optionally accelerated with Arm KleidiAI kernels on ARM hardware) and other ONNX-compatible frameworks.

### Quick Start

```python
import onnxruntime as ort

# Use "model_quantized.onnx" instead for the smaller quantized variant.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
# ... inference code here (input/output names depend on the export) ...
```

The ONNX export enables efficient inference on CPUs, NPUs, and other accelerators, making it ideal for local or edge deployments.
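
If the export follows Hugging Face Optimum's layout, the model can also be driven end-to-end through `optimum.onnxruntime`. A minimal generation sketch, assuming `optimum[onnxruntime]` and `transformers` are installed and the repository files sit in the current directory:

```python
# Minimal generation sketch via Hugging Face Optimum; assumes the ONNX files
# follow Optimum's layout and the tokenizer files sit alongside them.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_dir = "."  # directory containing the ONNX and tokenizer files
tokenizer = AutoTokenizer.from_pretrained(model_dir)
# Pass file_name="model_quantized.onnx" to load the quantized export instead.
model = ORTModelForCausalLM.from_pretrained(model_dir)

# Build a chat-formatted prompt from the tokenizer's bundled chat template.
messages = [{"role": "user", "content": "What is ONNX Runtime?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```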


## 📋 Credits


**Maintainer:** Makatia