L3-8B-Stheno Vietnamese LoRA Adapter
This is a QLoRA adapter for Sao10K/L3-8B-Stheno-v3.2, fine-tuned on a Vietnamese instruction dataset.
Model Details
Model Description
A Vietnamese-language adapter for L3-8B-Stheno-v3.2, trained with QLoRA (4-bit quantization) to add Vietnamese instruction-following capability while preserving the base model's strengths.
- Developed by: Petermantt
- Model type: LoRA Adapter for Causal Language Model
- Language(s) (NLP): Vietnamese, English
- License: Apache 2.0
- Finetuned from model: Sao10K/L3-8B-Stheno-v3.2
Model Sources
- Repository: https://github.com/Petermantt/vietnamese-llm-lora
- Base Model: https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2
Uses
Direct Use
This adapter is designed for Vietnamese text generation, instruction following, and conversational AI. It can be used for:
- Vietnamese chatbots and assistants
- Content generation in Vietnamese
- Translation assistance
- Educational applications
Quick Start
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# QLoRA config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    "Sao10K/L3-8B-Stheno-v3.2",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("Sao10K/L3-8B-Stheno-v3.2", trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "Petermantt/L3-8B-Stheno-Vietnamese-LoRA")

# Generate ("Xin chào! Bạn khỏe không?" = "Hello! How are you?")
prompt = "<|im_start|>user\nXin chào! Bạn khỏe không?<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)
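The snippet above prints the full sequence, including the prompt and special tokens (skip_special_tokens=False). If you only want the assistant's reply, one option is to decode just the newly generated tokens; a minimal sketch reusing the inputs and outputs variables from above:

# Decode only the tokens generated after the prompt
prompt_length = inputs["input_ids"].shape[1]
reply = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(reply)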
Out-of-Scope Use
- Not suitable for critical applications without human oversight
- Should not be used for generating harmful or misleading content
- May have biases from training data
Bias, Risks, and Limitations
- May reflect biases present in the Vietnamese Alpaca dataset
- Performance may vary on specialized domains not covered in training
- Inherits limitations from the base L3-8B-Stheno model
- Performs best on Vietnamese instructions; English capability is maintained but not enhanced
Recommendations
- Always verify outputs for factual accuracy
- Use with human oversight for important applications
- Consider domain-specific fine-tuning for specialized use cases
- Test thoroughly before production deployment
Compatibility
This LoRA adapter should work with:
- ✅ Sao10K/L3-8B-Stheno-v3.2 (tested)
- ✅ Other Llama-3 8B models with the same architecture
- ⚠️ May work with other Stheno variants (untested)
Requirements
- Same tokenizer as base model
- Compatible model architecture (Llama-3 8B)
- 4-bit quantization support (a loading sketch for an alternative base follows this list)
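As an illustration, attaching the adapter to a different Llama-3 8B base follows the same pattern as the Quick Start. The sketch below uses meta-llama/Meta-Llama-3-8B-Instruct purely as a hypothetical example of a compatible base; it has not been tested with this adapter:

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Hypothetical alternative base (untested with this adapter); it must share the
# Llama-3 8B architecture and tokenizer with Sao10K/L3-8B-Stheno-v3.2.
base_id = "meta-llama/Meta-Llama-3-8B-Instruct"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Loading will fail if the adapter's target modules do not match the base architecture.
model = PeftModel.from_pretrained(base, "Petermantt/L3-8B-Stheno-Vietnamese-LoRA")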
Training Details
Training Data
The adapter was trained on Vietnamese translations of the Alpaca dataset (~20,000 instructions), containing diverse instruction-following examples including (a formatting sketch follows this list):
- General knowledge Q&A
- Creative writing
- Problem-solving
- Code generation (basic)
- Conversational responses
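The exact preprocessing script is not published here. Purely as an illustration, an Alpaca-style record could be mapped onto the <|im_start|> chat format used in the Quick Start roughly as follows; the field names follow the standard Alpaca schema, and the Vietnamese strings are made-up placeholders, not actual dataset rows:

# Illustrative only: maps an Alpaca-style record onto the chat format used in
# the Quick Start. The actual training-time formatting may differ.
def format_example(record: dict) -> str:
    user_turn = record["instruction"]
    if record.get("input"):
        user_turn += "\n" + record["input"]
    return (
        "<|im_start|>user\n" + user_turn + "<|im_end|>\n"
        "<|im_start|>assistant\n" + record["output"] + "<|im_end|>\n"
    )

example = {
    "instruction": "Hãy giới thiệu ngắn gọn về Việt Nam.",  # "Briefly introduce Vietnam."
    "input": "",
    "output": "Việt Nam là một quốc gia ở Đông Nam Á...",   # "Vietnam is a country in Southeast Asia..."
}
print(format_example(example))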
Training Procedure
Training Configuration
- Base Model: Sao10K/L3-8B-Stheno-v3.2
- Training Method: QLoRA (4-bit quantization)
- LoRA Config:
  - Rank: 32
  - Alpha: 64
  - Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  - Dropout: 0.05
Training Hyperparameters
- Training regime: fp16 mixed precision
- Batch size: 1 (with gradient accumulation = 4)
- Learning rate: 2e-4 with cosine scheduler
- Warmup steps: 50
- Total steps: 1,250
- Max sequence length: 1,024
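The configuration and hyperparameters above correspond roughly to the following setup with the Hugging Face peft and transformers APIs. This is a minimal sketch for orientation, not the exact training script; the output directory and any unstated options are assumptions:

from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings as listed above
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Hyperparameters as listed above; the output directory is a hypothetical placeholder.
# The max sequence length (1,024) would be enforced during tokenization or by the trainer.
training_args = TrainingArguments(
    output_dir="stheno-vi-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    max_steps=1250,
    fp16=True,
)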
Training Infrastructure
- Hardware: NVIDIA RTX 3060 12GB
- Training time: ~2 hours 17 minutes
- Framework: PyTorch 2.5.1, Transformers 4.52.4, PEFT 0.15.2
Evaluation
Training Results
- Final Loss: 0.8693
- Final Accuracy: 78.8%
- Total Steps: 1,250
Training Progress
- Starting: Loss 1.68, Accuracy 64.8%
- Step 500: Loss 1.17, Accuracy 72.4%
- Step 1000: Loss 0.94, Accuracy 77.5%
- Final: Loss 0.87, Accuracy 78.8%
Example Outputs
Vietnamese Chat:
User: Xin chào, bạn có thể giới thiệu về Việt Nam không? ("Hello, can you tell me about Vietnam?")
Assistant: Việt Nam, còn được gọi là Cộng hòa Xã hội Chủ nghĩa Việt Nam, là một quốc gia nằm ở Đông Nam Á với diện tích 331.699 km2 và dân số khoảng 98 triệu người... ("Vietnam, also known as the Socialist Republic of Vietnam, is a country located in Southeast Asia with an area of 331,699 km² and a population of about 98 million people...")
Environmental Impact
- Hardware Type: NVIDIA RTX 3060 12GB
- Hours used: ~2.3
- Cloud Provider: Local training
- Compute Region: N/A
- Carbon Emitted: Minimal due to short training time and efficient QLoRA method
Technical Specifications
Model Architecture and Objective
LoRA adapter for a decoder-only Llama-3 8B transformer (L3-8B-Stheno-v3.2), trained with a standard causal language modeling objective.
Compute Infrastructure
Hardware
NVIDIA RTX 3060 12GB (local workstation)
Software
PyTorch 2.5.1, Transformers 4.52.4, PEFT 0.15.2
Citation
If you use this model, please cite:
BibTeX:
@misc{stheno-vietnamese-lora-2024,
  author    = {Petermantt},
  title     = {L3-8B-Stheno Vietnamese LoRA Adapter},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Petermantt/L3-8B-Stheno-Vietnamese-LoRA}
}
Model Card Authors
Petermantt
Model Card Contact
Please open an issue on the HuggingFace repository for questions or concerns.
Framework versions
- PEFT 0.15.2
- Transformers 4.52.4
- PyTorch 2.5.1
- Datasets 3.6.0
- Tokenizers 0.21.1