Qwen-Desi: Multilingual Indian Language Model
Model Description
Qwen-Desi is a multilingual language model fine-tuned from Qwen2.5-14B-Instruct (the unsloth/Qwen2.5-14B-Instruct-unsloth-bnb-4bit build), with Sarvam AI's Sarvam-M listed as its parent model. It is designed to support Indian languages and their transliterated variants: it understands and generates text in English, Hindi, Kannada, and the transliterated forms of Hindi and Kannada (Hinglish and Kannadish).
- Developed by: Anshuman Suresh and Bharatdeep Hazarika
- Model type: Causal Language Model
- Base model: unsloth/Qwen2.5-14B-Instruct-unsloth-bnb-4bit
- Parent model: sarvamai/sarvam-m (24B Mistral)
- Language(s): English, Hindi (Devanagari), Hinglish, Kannada, Kannadish
- License: Apache-2.0
Supported Languages
- English: Standard English language
- Hindi: Written in Devanagari script (हिंदी)
- Hinglish: Hindi words transliterated into Latin (English) script
- Kannada: Written in Kannada script (ಕನ್ನಡ)
- Kannadish: Kannada words transliterated into Latin (English) script
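The split between native scripts and transliterated text can be illustrated with a small script check. This is a minimal sketch, not part of the model: `detect_script` is a hypothetical helper that counts letters by Unicode block (Devanagari U+0900 to U+097F, Kannada U+0C80 to U+0CFF). It cannot tell English, Hinglish, and Kannadish apart, since all three use Latin letters.

```python
# Hypothetical helper: classify text by the script most of its letters use.
DEVANAGARI = range(0x0900, 0x0980)  # Unicode Devanagari block
KANNADA = range(0x0C80, 0x0D00)     # Unicode Kannada block


def detect_script(text: str) -> str:
    """Return 'devanagari', 'kannada', 'latin', or 'unknown'."""
    counts = {"devanagari": 0, "kannada": 0, "latin": 0}
    for ch in text:
        code_point = ord(ch)
        if code_point in DEVANAGARI:
            counts["devanagari"] += 1
        elif code_point in KANNADA:
            counts["kannada"] += 1
        elif ch.isascii() and ch.isalpha():
            counts["latin"] += 1
    best = max(counts, key=counts.get)
    # No recognised letters at all (digits, punctuation, other scripts)
    return best if counts[best] > 0 else "unknown"


print(detect_script("हिंदी"))          # devanagari
print(detect_script("ಕನ್ನಡ"))          # kannada
print(detect_script("Mujhe info do"))  # latin
```

A "latin" result only narrows the input to English, Hinglish, or Kannadish; deciding among those is left to the model or to the system prompt.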
Training Details
This Qwen2 model was efficiently fine-tuned using:
- Training Framework: Unsloth (advertised as about 2x faster training)
- Library: Hugging Face TRL (Transformer Reinforcement Learning)
- Base model: unsloth/Qwen2.5-14B-Instruct-unsloth-bnb-4bit
- Parent model: sarvamai/sarvam-m
- Language(s): English, Hindi (Devanagari), Hinglish, Kannada, Kannadish
- Architecture: Qwen2-based transformer
Usage
Direct Usage with OpenAI Compatible Package
from openai import OpenAI

# Point the client at any OpenAI-compatible server that hosts the model
client = OpenAI(
    base_url="http://<URL>/v1/",
    api_key="your-api-key",
)

messages = [
    {
        "role": "system",
        "content": """
You are a helpful assistant. You support five languages: English, Hindi, Hinglish, Kannada and Kannadish.
English is the standard English language. Hindi is written in Devanagari script.
Hinglish refers to Hindi words written in English script (Hindi transliterated to English).
Kannada is written in Kannada script. Kannadish refers to Kannada words written in English script
(Kannada transliterated to English). Infer user's query and answer in Kannadish (English script).
""",
    },
    {
        "role": "user",
        "content": "Mujhe Lebron James ke baare mai info do",
    },
]

# Stream the completion; include_usage adds a final chunk with token counts
response = client.chat.completions.create(
    model="Bharatdeep-H/qwen2.5-14b-desi",
    messages=messages,
    temperature=0.3,
    max_tokens=3096,
    frequency_penalty=0,
    presence_penalty=1.05,
    top_p=0.2,
    seed=42,
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in response:
    # The final usage chunk has an empty `choices` list, so check it before indexing
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
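When `stream_options={"include_usage": True}` is set, servers that follow the OpenAI streaming behaviour send one last chunk whose `choices` list is empty and whose `usage` field holds the token counts. The sketch below shows one way to collect both the text and the usage from such a stream. `collect_stream` and `_fake_chunk` are hypothetical names, and the fake chunks stand in for a real response so the example runs without a server.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join streamed delta text and return (text, usage)."""
    parts = []
    usage = None
    for chunk in chunks:
        # Content chunks carry one choice; the usage chunk carries none
        if chunk.choices:
            delta = chunk.choices[0].delta.content
            if delta:
                parts.append(delta)
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage
    return "".join(parts), usage


def _fake_chunk(text=None, usage=None):
    """Build a stand-in for a streamed chunk (illustration only)."""
    choices = []
    if text is not None:
        choices = [SimpleNamespace(delta=SimpleNamespace(content=text))]
    return SimpleNamespace(choices=choices, usage=usage)


demo = [
    _fake_chunk("Nam"),
    _fake_chunk("askara"),
    _fake_chunk(usage=SimpleNamespace(total_tokens=12)),
]
text, usage = collect_stream(demo)
print(text, usage.total_tokens)  # Namaskara 12
```

With a real client, the same function can be called as `collect_stream(response)` on the streamed response object.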
Model Capabilities
- Code-mixed conversations: Handles conversations that mix English with Hindi or Kannada
- Script flexibility: Understands both native scripts and transliterated text
- Multi-turn dialogue: Maintains context across conversation turns
- Language detection: Automatically infers the preferred response language
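The system prompt in the Usage section pins the reply language explicitly rather than leaving it to inference. A small helper can make that choice a parameter. This is a sketch under assumptions: `build_messages`, `SUPPORTED`, and the wording of the final instruction sentence are illustrative and not taken from the model's training setup; the language description reuses the prompt text from the Usage example.

```python
# Hypothetical helper for choosing the reply language per request.
SUPPORTED = ("English", "Hindi", "Hinglish", "Kannada", "Kannadish")

LANGUAGE_NOTES = (
    "You are a helpful assistant. You support five languages: English, Hindi, "
    "Hinglish, Kannada and Kannadish. English is the standard English language. "
    "Hindi is written in Devanagari script. Hinglish refers to Hindi words "
    "written in English script. Kannada is written in Kannada script. "
    "Kannadish refers to Kannada words written in English script."
)


def build_messages(user_query, reply_language=None):
    """Return chat messages; pin the reply language when one is given."""
    if reply_language is not None and reply_language not in SUPPORTED:
        raise ValueError(f"Unsupported language: {reply_language}")
    if reply_language is None:
        # Assumed wording: let the model pick the language of the query
        instruction = "Infer the user's query and answer in the user's language."
    else:
        instruction = f"Infer the user's query and answer in {reply_language}."
    return [
        {"role": "system", "content": f"{LANGUAGE_NOTES} {instruction}"},
        {"role": "user", "content": user_query},
    ]


messages = build_messages("Mujhe Lebron James ke baare mai info do", "Hinglish")
print(messages[0]["content"][-30:])
```

Passing `reply_language=None` leaves the choice to the model, which is the behaviour the last bullet describes.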
Recommended Parameters
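# Recommended inference parameters
temperature = 0.3        # Low temperature for more focused responses
max_tokens = 4096        # Upper bound on generated tokens; adjust to your needs
top_p = 0.2              # Narrow nucleus sampling for controlled generation
frequency_penalty = 0    # No frequency-based repetition penalty applied
presence_penalty = 1.05  # Penalizes tokens already present, encouraging more diverse responses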
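One way to reuse these values is to keep them in a single dictionary and merge per-request overrides on top. This is only a sketch: `RECOMMENDED_PARAMS` and `request_kwargs` are hypothetical names, and the resulting dictionary is meant to be unpacked into `client.chat.completions.create(...)` as in the Usage example.

```python
# Recommended values from the list above, gathered in one place.
RECOMMENDED_PARAMS = {
    "temperature": 0.3,
    "max_tokens": 4096,
    "top_p": 0.2,
    "frequency_penalty": 0,
    "presence_penalty": 1.05,
}


def request_kwargs(messages, model="Bharatdeep-H/qwen2.5-14b-desi", **overrides):
    """Return keyword arguments for a chat completion request.

    Overrides win over the recommended defaults; the defaults dict
    itself is never modified.
    """
    return {"model": model, "messages": messages, **RECOMMENDED_PARAMS, **overrides}


kwargs = request_kwargs(
    [{"role": "user", "content": "Namaskara"}],
    max_tokens=512,  # shorter replies for this request only
)
print(kwargs["max_tokens"], kwargs["temperature"])  # 512 0.3
```

With a configured client, the call would then be `client.chat.completions.create(**kwargs)`.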
Limitations
- Primary focus on Indian languages may limit performance on other languages
- Performance may vary between formal and informal language styles
- Transliterated text quality depends on consistency of transliteration schemes
Ethical Considerations
- Trained to be helpful and respectful across all supported languages
- Aims to preserve cultural nuances and appropriate language use
- Users should be aware of potential biases in multilingual outputs
Acknowledgments
- Built with Unsloth for efficient training
- Based on Sarvam-AI's excellent work on Indian large language models
- Powered by Hugging Face's TRL library
Model tree for Bharatdeep-H/qwen2.5-14b-desi
- Base model: Qwen/Qwen2.5-14B
- Finetuned: Qwen/Qwen2.5-14B-Instruct