Gemma 2B TFLite Model for Mobile

This repository contains a TensorFlow Lite version of the Gemma 2B Instruct model, optimized for mobile deployment.

Model Information

  • Base Model: Gemma 2B Instruct
  • Format: TensorFlow Lite
  • Quantization: INT4 (GPU-optimized)
  • Model Size: 1.1 GB
  • Framework: TensorFlow Lite

Files

  • gemma-2b-it-gpu-int4.tflite - The quantized TFLite model
  • tokenizer.model - SentencePiece tokenizer

Usage

Download in Flutter App

const modelUrl = 'https://huggingface.co/mayur1496/gemma-2b-tflite/resolve/main/gemma-2b-it-gpu-int4.tflite';
const tokenizerUrl = 'https://huggingface.co/mayur1496/gemma-2b-tflite/resolve/main/tokenizer.model';

// Use dio or http package to download
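As a minimal sketch of the download step (assuming the `http` and `path_provider` packages are declared in `pubspec.yaml`; names like `downloadIfMissing` are illustrative, not part of this repository), the files can be streamed to the app's documents directory so the 1.1 GB model is never buffered in memory:

```dart
import 'dart:io';

import 'package:http/http.dart' as http;
import 'package:path_provider/path_provider.dart';

/// Downloads [url] into the app's documents directory, skipping the
/// transfer if the file is already present from a previous launch.
Future<File> downloadIfMissing(String url, String filename) async {
  final dir = await getApplicationDocumentsDirectory();
  final file = File('${dir.path}/$filename');
  if (await file.exists()) return file;

  // Stream the response body straight to disk; http.get() would hold
  // the entire 1.1 GB model in memory at once.
  final client = http.Client();
  try {
    final response = await client.send(http.Request('GET', Uri.parse(url)));
    if (response.statusCode != 200) {
      throw HttpException('Download failed: HTTP ${response.statusCode}');
    }
    await response.stream.pipe(file.openWrite());
  } finally {
    client.close();
  }
  return file;
}
```

Typical call site: `final modelFile = await downloadIfMissing(modelUrl, 'gemma-2b-it-gpu-int4.tflite');` (and the same for `tokenizerUrl`).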

Load Model

import 'dart:io';

import 'package:tflite_flutter/tflite_flutter.dart';

// Load the model from the downloaded file path; fromAsset only works
// for models bundled inside the app package.
final interpreter = Interpreter.fromFile(File(modelPath));
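Because this checkpoint is the GPU-optimized INT4 build, it may be worth attaching a GPU delegate when creating the interpreter. A hedged sketch, assuming tflite_flutter's `GpuDelegateV2` (Android; iOS would use the Metal delegate instead) and a `modelPath` pointing at the downloaded file:

```dart
import 'dart:io';

import 'package:tflite_flutter/tflite_flutter.dart';

// Configure interpreter options with the Android GPU delegate before
// loading the downloaded model from disk.
final options = InterpreterOptions()..addDelegate(GpuDelegateV2());
final interpreter = Interpreter.fromFile(File(modelPath), options: options);
```

If delegate creation fails on a given device, dropping the `options` argument falls back to CPU execution.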

Requirements

  • RAM: 4GB+ recommended
  • Storage: 1.5GB free space
  • Platform: Android (API 21+) / iOS (12.0+)

License

This model is released under the Gemma license. See Google's Gemma Terms of Use for details.

Citation

@article{gemma_2024,
  title={Gemma: Open Models Based on Gemini Research and Technology},
  author={Gemma Team},
  year={2024}
}