Model Overview
RefinedNeuro/RN_TR_R2 is a Turkish-language reasoning model fine-tuned from ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 using the GRPO method on Q&A data distilled from Qwen3. It excels at open-ended reasoning tasks in Turkish, particularly STEM, history, and cultural questions.
- Available on: https://ollama.com/RefinedNeuro/RN_TR_R2
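With Ollama installed, the model can be run locally via `ollama run RefinedNeuro/RN_TR_R2` (assuming the default tag on the listing above; check the Ollama page for the exact tags).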
Evaluation
Benchmark: RN_TR_R2_Benchmark_Results (Open-ended Turkish culture & reasoning questions)
| Model Name | Score (%) | Description |
|---|---|---|
| RefinedNeuro/RN_TR_R2 | 82.4 | Upgraded from RN_TR_R1 via GRPO on distilled Qwen3 data |
| Qwen3-8B | 76.8 | Off-the-shelf Qwen3-8B |
| RefinedNeuro/RN_TR_R1 | 64.8 | Baseline Turkish-fine-tuned model |
| ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1 | 63.2 | Turkish-adapted Llama-8B via direct preference optimization |
| Meta-Llama-3.1-8B-Instruct | 63.2 | Meta's Llama-3.1-8B with instruct tuning |
RN_TR_R2 achieves a 17.6-percentage-point improvement over the RN_TR_R1 baseline.
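The exact scoring script for these open-ended questions is not part of this card. Below is a minimal, hypothetical scorer, assuming each reply puts its final answer in `\boxed{}` (as the system prompt in the usage example below requests) and that reference answers are exact strings:

```python
import re

def extract_boxed(reply: str) -> str | None:
    """Return the contents of the last \\boxed{...} in a model reply, if any."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", reply)
    return matches[-1].strip() if matches else None

def score(replies: list[str], references: list[str]) -> float:
    """Exact-match accuracy (%) over extracted final answers; hypothetical scorer."""
    correct = sum(extract_boxed(r) == ref.strip() for r, ref in zip(replies, references))
    return 100.0 * correct / len(references)
```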
Intended Use
- Primary use cases: Question answering, step-by-step reasoning, educational tools in Turkish (math, physics, chemistry, biology, geometry, trigonometry, statistics, history, culture).
- Not recommended: Generating creative fiction or tasks requiring memorized facts outside its training scope.
- Known limitations: Like all LLMs, RN_TR_R2 may hallucinate; no systematic failure modes have been identified.
Training Data
Base model: ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1
Fine-tuning dataset: RefinedNeuro/Qwen3-Reasoning-Distill-Q-A-Dataset (see the loading sketch after this list)
- Domains: Mathematics, Physics, Chemistry, Biology, Geometry, Trigonometry, Statistics
- Grade levels: 6thβ12th grade
- Prompt variations (e.g., long/detailed vs. short/easy; answer must be a single number)
- Size: 13,000 examples (~9 million tokens)
- License: Creative Commons
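To inspect the distilled data, the dataset can be pulled with the Hugging Face `datasets` library; the repository name comes from this card, while the split name is an assumption to verify on first load:

```python
from datasets import load_dataset

# Repository name is from this card; the "train" split is an assumption.
ds = load_dataset("RefinedNeuro/Qwen3-Reasoning-Distill-Q-A-Dataset", split="train")
print(ds)     # column names and row count (~13,000 examples per this card)
print(ds[0])  # one distilled question/answer pair
```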
Training Configuration
- Method: GRPO (Group Relative Policy Optimization) via unsloth
- Hardware: 2× RTX 3090 GPUs for 1 week (~336 GPU-hours)
- Batch size: 1
- Context length: 4,096 tokens (max_seq_length=8192 at inference)
- Optimizer & scheduler: cosine LR schedule, peak LR = 5×10⁻⁶
- Library: unsloth==2025.3.18
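The training script itself is not published on this card. As a rough sketch, a GRPO run with the hyperparameters above could be wired up through Unsloth and TRL's `GRPOTrainer`; the reward function and toy dataset below are illustrative assumptions, and exact argument compatibility varies across `trl`/`unsloth` versions:

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer
from unsloth import FastLanguageModel

# The base model and the values marked "from this card" are from this card;
# the reward function and the toy dataset are illustrative assumptions.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1",
    max_seq_length=4096,  # training context length from this card
    load_in_4bit=True,
)

def boxed_answer_reward(completions, answer, **kwargs):
    """Toy reward: 1.0 if a completion contains \\boxed{<reference answer>}."""
    return [1.0 if f"\\boxed{{{a}}}" in c else 0.0 for c, a in zip(completions, answer)]

# Tiny stand-in dataset; the real run used ~13,000 distilled Q&A pairs.
train_dataset = Dataset.from_dict({
    "prompt": ["6 ve 2'nin çarpımı nedir?"],
    "answer": ["12"],
})

trainer = GRPOTrainer(
    model=model,
    reward_funcs=boxed_answer_reward,
    args=GRPOConfig(
        output_dir="outputs",
        learning_rate=5e-6,              # peak LR from this card
        lr_scheduler_type="cosine",      # cosine schedule from this card
        per_device_train_batch_size=1,   # batch size from this card
        gradient_accumulation_steps=4,   # assumption, not stated on the card
        num_generations=4,               # assumption, not stated on the card
        max_completion_length=2048,      # assumption, not stated on the card
    ),
    train_dataset=train_dataset,
)
trainer.train()
```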
Usage Example
```bash
pip install unsloth==2025.3.18
```
```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="RefinedNeuro/RN_TR_R2",
    max_seq_length=8192,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

# Turkish system prompt; roughly: "When answering, think step by step and
# justify your reasoning. First think between <think></think> tags, then
# summarize your steps, and finally give the answer inside \boxed{}."
SYSTEM_PROMPT = """
Sana sorulan sorulara cevap verirken, adım adım düşün ve gerekçelendir.
Önce <think></think> tagları arasında adım adım düşün.
Sonra düşünme adımlarını özetle.
En son \\boxed{} içerisinde sorunun cevabını ver.
"""

def answer_question(question: str):
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to("cuda")
    streamer = TextStreamer(tokenizer)
    print("-" * 60)
    _ = model.generate(
        input_ids=inputs,
        streamer=streamer,
        max_new_tokens=2048,
        temperature=0.6,
        top_p=0.1,
    )

# Examples
answer_question("Türkiye'nin başkenti neresidir?")  # "What is the capital of Türkiye?"
answer_question("6 ve 2'nin çarpımı nedir?")        # "What is the product of 6 and 2?"
```
Sample Q&A
> **Q:** f(x) = 3x⁴ - 2x³ + 5x - 7 fonksiyonunun x = 1 noktasındaki türevini bulunuz. ("Find the derivative of f(x) = 3x⁴ - 2x³ + 5x - 7 at x = 1.")
>
> **A:** <think>
> 1. Adım: f(x) = 3x⁴ - 2x³ + 5x - 7 fonksiyonunun türevini bulmak için, önce f(x)'nin türevini bulmamız gerekmektedir.
> 2. Adım: Türevi bulmak için fonksiyonu parçalara ayırarak analiz edeceğiz.
> f(x) = 3x⁴ - 2x³ + 5x - 7
> 3. Adım: Türevi bulmak için her terimi ayrı ayrı türevini buluyoruz.
> - 3x⁴: Türevi 12x³'dür.
> - -2x³: Türevi -6x²'dir.
> - 5x: Türevi 5'dir.
> - -7: Türevi 0'dır (çünkü sabit bir sayıdır).
> 4. Adım: Türevi bulmak için elde edilen türevleri birleştiriyoruz.
> f'(x) = 12x³ - 6x² + 5
> 5. Adım: Şimdi x = 1 yerine koyarak f'(x)'nin değerini hesaplayacağız.
> f'(1) = 12(1)³ - 6(1)² + 5
> f'(1) = 12(1) - 6(1) + 5
> f'(1) = 12 - 6 + 5
> f'(1) = 11
> 6. Adım: x = 1 noktasındaki türevi 11 olarak bulduk.
> \boxed{11}
> </think>
> \boxed{11}
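The derivative in the sample is easy to verify independently; a quick check with `sympy` (not part of the card):

```python
import sympy as sp

x = sp.symbols("x")
f = 3 * x**4 - 2 * x**3 + 5 * x - 7

f_prime = sp.diff(f, x)
print(f_prime)             # 12*x**3 - 6*x**2 + 5
print(f_prime.subs(x, 1))  # 11
```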
Citation
If you use this model, please cite:
```bibtex
@misc{farsak2025rn_tr_r2,
  title        = {RefinedNeuro/RN\_TR\_R2: A Turkish Reasoning Model},
  author       = {Mehmet Can Farsak and Serhat Atayeter and RefinedNeuro},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/RefinedNeuro/RN_TR_R2}},
  license      = {Apache-2.0}
}
```
License
Apache 2.0, a permissive open-source license.
Authors
- Mehmet Can Farsak, Serhat Atayeter, RefinedNeuro