🪔 GujjuGPT-v1 Model Card

Model Name: GujjuGPT-v1

Model Type: Gujarati Language Large Language Model (LLM)

Overview

GujjuGPT-v1 is a Gujarati-focused LLM intended to support AI development for the Gujarati-speaking community. It is fine-tuned from the llama2-7b-hf base model using LoRA through PEFT, which keeps the memory and compute cost of training low.

Key Features

  • Language: Gujarati (fine-tuned on the saghara dataset)
  • Architecture: Built on llama2-7b-hf, fine-tuned for Gujarati language understanding and generation
  • Tokenizer: Custom tokenizer tailored to Gujarati text
  • Training Methodology: LoRA, applied through PEFT, for parameter-efficient fine-tuning with reduced compute requirements
  • Model Size: 7 billion parameters (7B)
  • Precision: 8-bit, chosen for speed and resource efficiency
  • Use Cases: Text generation, question-answering, translation, and more in Gujarati
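
To give a sense of why LoRA reduces training cost, the sketch below counts the trainable parameters LoRA would add to a Llama-2-7B-sized model. The hidden size (4096) and layer count (32) are the published Llama-2-7B values. The rank (`r = 8`) and the choice to adapt only the query and value projections are assumptions for illustration, since this card does not state the LoRA configuration actually used.

```python
# Rough count of trainable LoRA parameters on a Llama-2-7B-sized model.
# ASSUMPTIONS (not stated in this model card): rank 8, adapters on the
# query and value projections only (2 adapted matrices per layer).

HIDDEN_SIZE = 4096   # Llama-2-7B hidden dimension
NUM_LAYERS = 32      # Llama-2-7B decoder layers


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) beside a frozen weight."""
    return rank * (d_in + d_out)


def total_lora_params(rank: int, adapted_per_layer: int) -> int:
    """Total adapter parameters when square attention projections are adapted."""
    per_matrix = lora_params(HIDDEN_SIZE, HIDDEN_SIZE, rank)
    return per_matrix * adapted_per_layer * NUM_LAYERS


trainable = total_lora_params(rank=8, adapted_per_layer=2)
base = 7_000_000_000  # nominal 7B parameter count from this card

print(f"trainable LoRA parameters: {trainable:,}")
print(f"share of base model: {trainable / base:.4%}")
```

Under these assumptions the adapters hold about 4.2 million parameters, well under 0.1% of the base model; a different rank or set of target modules changes the figure proportionally.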

🚀 Installation & Usage

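from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("cypher-hritam/gujjuGPT-v1")
model = AutoModelForCausalLM.from_pretrained(
    "cypher-hritam/gujjuGPT-v1",
    device_map="auto",  # place weights on available devices (requires accelerate)
)

# Gujarati prompt, roughly: "Write a text in the Gujarati language:"
text = "ગુજરાતી ભાષામાં પાઠ લખો:"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate up to 128 new tokens and decode them back to text
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))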
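
Before loading the model as shown above, it can help to estimate how much memory the weights alone will need. The sketch below is a back-of-the-envelope calculation for a nominal 7B-parameter model at 8-bit precision, with 16-bit and 32-bit shown for comparison. The helper name `weight_memory_gib` is hypothetical, and the estimate ignores activations, the KV cache, and framework overhead.

```python
# Back-of-the-envelope weight memory for a nominal 7B-parameter model.
# This covers weights only; activations, KV cache and framework overhead
# are NOT included, so real usage will be higher.

NUM_PARAMS = 7_000_000_000  # nominal 7B from the model card


def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Memory needed to hold the weights, in GiB (2**30 bytes)."""
    return num_params * bits_per_param / 8 / 2**30


for label, bits in [("32-bit", 32), ("16-bit", 16), ("8-bit", 8)]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

At 8 bits per parameter the weights come to roughly 6.5 GiB, a quarter of the 32-bit footprint.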

🚧 Out of Scope

  • Not designed for tasks outside Gujarati natural language processing (no support for code, non-Gujarati languages, or multimodal inputs)
  • Not intended for medical, legal, or critical decision-making applications
  • No guarantees of content moderation or bias mitigation; users should add their own pre- and post-processing for responsible deployment
  • Not optimized or validated for real-time, low-latency production environments
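
As one example of the kind of pre-processing a deployment might add, the sketch below checks whether a prompt is predominantly written in Gujarati script before it is sent to the model. It relies only on the Unicode Gujarati block (U+0A80 to U+0AFF). The function names and the 0.8 threshold are hypothetical choices for illustration and are not part of this model or its tokenizer.

```python
# Minimal input guard: is the text mostly Gujarati script?
# The helper names and the 0.8 threshold are illustrative assumptions.

GUJARATI_START, GUJARATI_END = 0x0A80, 0x0AFF  # Unicode Gujarati block


def is_gujarati_char(ch: str) -> bool:
    """True for any code point in the Gujarati block (letters, signs, digits)."""
    return GUJARATI_START <= ord(ch) <= GUJARATI_END


def gujarati_ratio(text: str) -> float:
    """Share of script characters that are Gujarati.

    Whitespace, punctuation and non-Gujarati digits are ignored; only
    Gujarati-block characters and other alphabetic characters are counted.
    """
    counted = [ch for ch in text if is_gujarati_char(ch) or ch.isalpha()]
    if not counted:
        return 0.0
    return sum(is_gujarati_char(ch) for ch in counted) / len(counted)


def is_mostly_gujarati(text: str, threshold: float = 0.8) -> bool:
    """Accept the text only if enough of its script characters are Gujarati."""
    return gujarati_ratio(text) >= threshold


print(is_mostly_gujarati("ગુજરાતી ભાષામાં પાઠ લખો:"))
print(is_mostly_gujarati("Write some text in English."))
```

A check like this only filters by script; it says nothing about content quality or safety, which would need separate moderation.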

Made with Llama2 | Fine-tuned on Saghara | Powered by LoRA & PEFT

Bringing the power of AI to the heart of Gujarat!
