Overview
This is a fine-tuned version of the LLaMA-3.3-70B-Instruct model focused on Turkish language processing, leveraging LoRA (Low-Rank Adaptation) for efficient adaptation to specialized tasks. The model is trained for causal language modeling and targets text generation, comprehension, and structured reasoning in Turkish.
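The card does not publish the LoRA hyperparameters (rank, alpha, dropout, target modules), so the values in the sketch below are illustrative assumptions; it only shows the general pattern of wrapping the base model in a PEFT LoRA adapter for causal-LM fine-tuning.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# All LoRA settings below are assumptions for illustration; the actual values are not published.
lora_config = LoraConfig(
    r=16,                     # assumed low-rank dimension
    lora_alpha=32,            # assumed scaling factor
    lora_dropout=0.05,        # assumed dropout on the adapter layers
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical attention projections
    task_type="CAUSAL_LM",
)

# Loading the full 70B base requires substantial GPU memory (or offloading).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter matrices are trainable
```

With settings like these, only a small fraction of the 70B parameters is trainable, which is what makes the adaptation efficient.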
Model Description
- Model Type: LLaMA (Large Language Model Meta AI)
- Version: 3.3-70B-Instruct
- Number of Parameters: 70 Billion
- Pretraining Task: Causal Language Modeling
- Training Objective: To fine-tune the base LLaMA model for improved Turkish language understanding with LoRA.
- Tokenizer: A tokenizer adapted to Turkish, enabling efficient tokenization of Turkish text (see the sketch after this list).
- Training Data: A specialized Turkish corpus tailored for tasks such as reasoning, comprehension, and structured output generation.
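As a quick illustration of the tokenizer, the sketch below loads it from the adapter repository (the repository id is taken from the usage example later in this card, and it is assumed to ship the Turkish-adapted tokenizer) and tokenizes a short Turkish sentence.

```python
from transformers import AutoTokenizer

# Assumes the adapter repository ships the Turkish-adapted tokenizer described above.
tokenizer = AutoTokenizer.from_pretrained("newmindai/Llama-3.3-70b-Instruct")

text = "Türkçe doğal dil işleme için ince ayar yapılmış bir modeldir."  # "It is a model fine-tuned for Turkish NLP."
tokens = tokenizer.tokenize(text)
print(len(tokens), tokens)  # fewer tokens per sentence generally indicates better Turkish coverage
```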
Intended Use
This model is designed for tasks requiring Turkish text generation and comprehension, such as:
- Text Generation: Generating coherent, contextually relevant text in Turkish.
- Question Answering: Answering questions posed in Turkish, leveraging both the fine-tuned model and structural task handling.
- Text Summarization: Summarizing complex Turkish texts into concise outputs.
- Dialogue Systems: Enabling interactive dialogue systems that can converse in Turkish.
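For the dialogue use case, the sketch below builds a Turkish chat prompt with the tokenizer's chat template; the repository id is taken from the usage example later in this card, the example messages are illustrative, and it is assumed the shipped tokenizer keeps the Llama 3.3 Instruct chat template.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("newmindai/Llama-3.3-70b-Instruct")

messages = [
    {"role": "system", "content": "Sen yardımsever bir Türkçe asistansın."},  # "You are a helpful Turkish assistant."
    {"role": "user", "content": "Tarhana çorbası nasıl yapılır? Kısaca anlat."},  # "How is tarhana soup made? Explain briefly."
]

# Produces a prompt string formatted for the instruct model; tokenize it and pass it to
# model.generate() exactly as in the usage example below.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```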
Model Parameters
- Max Position Embeddings: 131072
- Number of Attention Heads: 64
- Number of Hidden Layers: 80
- Vocab Size: 128256
- Pretraining TP: 1
- RoPE Scaling Factors:
  - High-Frequency Factor: 4.0
  - Low-Frequency Factor: 1.0
- Tie Word Embeddings: False
- RMS Norm EPS: 1e-05
- Training Arguments:
  - Epochs: 3
  - Training Loss: 0.0995
  - Training Samples per Second: 3.417
  - Training Steps per Second: 0.027
  - Training Runtime: 6 days, 4:59:46.09
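Assuming the base checkpoint is meta-llama/Llama-3.3-70B-Instruct (a gated repository, so Hub access must be granted first), the configuration values listed above can be checked without downloading any weights:

```python
from transformers import AutoConfig

# Loads only the model configuration, not the 70B weights.
config = AutoConfig.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")

print(config.max_position_embeddings)  # 131072
print(config.num_attention_heads)      # 64
print(config.num_hidden_layers)        # 80
print(config.vocab_size)               # 128256
print(config.pretraining_tp)           # 1
print(config.rope_scaling)             # includes high_freq_factor and low_freq_factor
print(config.rms_norm_eps)             # 1e-05
print(config.tie_word_embeddings)      # False
```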
Training Results
- Epoch: 3.0
- Total FLOPs: 224,757,703 GFLOPs (≈ 2.25 × 10^17 FLOPs)
- Train Loss: 0.0995
- Train Runtime: 6 days, 4:59:46.09
- Train Samples per Second: 3.417
- Train Steps per Second: 0.027
- Training Loss Figure: plot of the training loss over the run (image not included here)
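A quick consistency check on the throughput figures above: dividing samples per second by steps per second gives the effective global batch size implied by the logs. This is an inference from the reported numbers, not a published training setting.

```python
# Values copied from the training results above.
samples_per_sec = 3.417
steps_per_sec = 0.027

# Implied effective global batch size (samples processed per optimizer step).
effective_batch = samples_per_sec / steps_per_sec
print(f"implied global batch size ≈ {effective_batch:.0f}")  # ≈ 127, i.e. most likely 128
```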
Evaluation Results
The model was evaluated on a dataset containing 67,882 examples. The evaluation results are as follows:
- Eval Loss: 0.108
- Eval Runtime: 2:39:22.40
- Eval Samples per Second: 7.099
- Eval Steps per Second: 0.887
Final performance was benchmarked using the Mezura🥇 framework — a standardized evaluation suite developed by NewmindAI for structured Turkish NLP tasks.
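Assuming the reported eval loss is the mean per-token cross-entropy in nats (the usual convention for causal language models), it corresponds to a token-level perplexity of roughly exp(0.108) ≈ 1.11:

```python
import math

eval_loss = 0.108                 # reported above; assumed to be mean per-token cross-entropy in nats
perplexity = math.exp(eval_loss)  # standard conversion: perplexity = exp(loss)
print(f"perplexity ≈ {perplexity:.3f}")  # ≈ 1.114
```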
Usage Example
To use the model for Turkish text generation, load the base model with the transformers library and attach the LoRA adapter with peft:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Base checkpoint and the Turkish LoRA adapter from this repository
base_model_id = "meta-llama/Llama-3.3-70B-Instruct"
adapter_id = "newmindai/Llama-3.3-70b-Instruct"

# Load the tokenizer and the base model (fp16, sharded across available GPUs)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the fine-tuned LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_id)

# Inference
prompt = "Tarhana en çok hangi il ile özdeşleşmiştir?"  # "Which province is tarhana most associated with?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
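If you want to serve the model without a runtime peft dependency, the LoRA adapter can optionally be merged into the base weights; the output directory below is a placeholder, and saving the merged model requires disk space for the full 70B weights.

```python
# Fold the LoRA weights into the base model so plain transformers can load the result.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("llama-3.3-70b-instruct-tr-merged")  # hypothetical output path
tokenizer.save_pretrained("llama-3.3-70b-instruct-tr-merged")
```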