---
license: apache-2.0
tags:
- unsloth
- LoRA
- trl
- hinglish
- text-generation-inference
datasets:
- fhai50032/Hinglish-CoT-General
language:
- en
base_model:
- unsloth/Meta-Llama-3.1-8B
pipeline_tag: text-generation
library_name: adapter-transformers
---
# Llama-3.1-8B-Hinglish-General-sft
**Llama-3.1-8b-Hinglish-General-sft** is a lightweight, domain-specific fine-tuned model built for **conversational Hinglish-style reasoning**, with a focus on general Hinglish knowledge. It builds upon `Meta-Llama-3.1-8B` and uses **LoRA adapters** for efficient fine-tuning with **Unsloth**.
> ⚠️ This model is a demonstration of supervised fine-tuning and is intended solely for educational and informational purposes. It is not validated for critical applications and should not be used for real-life decision-making.
---
## Model Summary
- **Base Model:** [`unsloth/Meta-Llama-3.1-8B`](https://huggingface.co/unsloth/Meta-Llama-3.1-8B)
- **LoRA Adapter:** `Subh775/Llama-3.1-8b-Hinglish-General-sft`
- **Fine-tuned Dataset:** [`fhai50032/Hinglish-CoT-General`](https://huggingface.co/datasets/fhai50032/Hinglish-CoT-General)
- **Language:** Hinglish (Hindi-English mix)
- **Training Time:** 49.24 minutes (1 epoch)
- **Framework:** [Unsloth](https://github.com/unslothai/unsloth)
- **Quantization:** 4-bit (for efficient inference)
---
## Key Features
- **Hinglish-CoT Reasoning:** Trained on ~2K question-answer pairs with step-by-step reasoning in Hinglish.
- **Efficient Inference:** Enabled by LoRA + Unsloth + 4-bit quantization.
- **Fast and Lightweight:** Optimized for quick inference even on limited hardware.
---
## Inference Instructions
### Installation
```bash
pip install unsloth
```
```python
from unsloth import FastLanguageModel
import torch
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{question}
### Input:
{thoughts}
### Response:
{answer}"""
# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Subh775/Llama-3.1-8b-Hinglish-General-sft",
    max_seq_length=2048,
    load_in_4bit=True
)
FastLanguageModel.for_inference(model)
```
```python
import re

def clean_response(text):
    # Keep only the text after the final "### Response:" marker,
    # then drop any leftover template lines.
    if "### Response:" in text:
        text = text.split("### Response:")[-1]
    lines = text.strip().splitlines()
    clean_lines = [
        line.strip()
        for line in lines
        if not re.match(r"^(#|input:|response:)", line, re.IGNORECASE)
    ]
    return " ".join(clean_lines).strip()

def chat():
    print("Chat with Llama-3.1-8b-Hinglish-General-sft! Type '\\q' or 'quit' to stop.\n")
    chat_history = ""
    while True:
        user_input = input("> ")
        if user_input.lower() in ['\\q', 'quit']:
            print("\nExiting the chat. Goodbye!")
            print("=" * 32 + "\n")
            break
        question = user_input
        thoughts = "User is asking a genuine question. Thinking step-by-step in Hinglish."
        prompt = alpaca_prompt.format(question=question, thoughts=thoughts, answer="")
        chat_history += prompt + "\n"
        inputs = tokenizer([chat_history], return_tensors="pt").to("cuda")
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
            num_return_sequences=1,
            do_sample=True,
            no_repeat_ngram_size=2,
        )
        decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
        clean_output = clean_response(decoded_output)
        chat_history += f"{clean_output}\n"
        print(f"\n{clean_output}\n")

chat()
```
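The prompt template and cleaning helper above can be sanity-checked without a GPU. This standalone snippet repeats both and runs them on a mock generation (the mock output string is illustrative, not real model output):

```python
import re

# Same Alpaca-style template used for inference above
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{question}
### Input:
{thoughts}
### Response:
{answer}"""

def clean_response(text):
    # Keep only what follows the final "### Response:" marker,
    # then drop any leftover template lines.
    if "### Response:" in text:
        text = text.split("### Response:")[-1]
    lines = text.strip().splitlines()
    clean_lines = [
        line.strip()
        for line in lines
        if not re.match(r"^(#|input:|response:)", line, re.IGNORECASE)
    ]
    return " ".join(clean_lines).strip()

prompt = alpaca_prompt.format(
    question="Photosynthesis kya hai?",
    thoughts="Thinking step-by-step in Hinglish.",
    answer="",
)
# A raw generation echoes the whole prompt; cleaning strips it back down
fake_output = prompt + "Photosynthesis ek process hai jisme plants sunlight se khana banate hain."
print(clean_response(fake_output))
```

Because generation echoes the prompt, stripping everything before the last `### Response:` marker is what keeps multi-turn history readable.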
## Training Details
- Dataset Used: Hinglish-CoT-General
- Total Samples: 2,015 examples
- Training Time: ~49 minutes (on 1 epoch)
- Final Step: 60
- Final Training Loss: 0.776
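
The exact training script is not published; a minimal sketch of a typical Unsloth + TRL setup consistent with the numbers above is shown below. The LoRA rank, alpha, batch size, learning rate, and text-field name are all assumptions (note that 2,015 samples at an effective batch size of 32 gives ~63 optimizer steps, matching the final step of 60). This is a configuration sketch, not the authors' actual script, and it requires a CUDA GPU to run:

```python
# Hypothetical reconstruction of the fine-tuning setup (hyperparameters assumed)
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # assumed LoRA rank
    lora_alpha=16,  # assumed scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("fhai50032/Hinglish-CoT-General", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumes rows pre-formatted with the Alpaca template
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=16,  # effective batch size 32 -> ~63 steps per epoch
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```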
## Limitations
- Generalized understanding: responses may not reflect recent developments.
- The fine-tuning dataset is small (~2,015 examples), so model responses may be less accurate than those of larger-scale fine-tunes.
## License
This model is licensed under the Apache 2.0 License, same as its base model.
## Citation
```bibtex
@misc{llama3_8b_hinglish_general_2025,
author = {Subh775},
title = {Llama-3.1 8B Hinglish General SFT},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/Subh775/Llama-3.1-8b-Hinglish-General-sft}},
note = {Hugging Face Repository}
}
```