OpenOranje/TweeTaal-nl-en-0.6B
Model Description
The TweeTaal-nl-en model has been fine-tuned on Dutch-English and English-Dutch translation pairs to produce accurate, fluent translations. Its compact 0.6B-parameter size makes it suitable for deployment in resource-constrained environments while maintaining strong translation quality.
Intended Use
Primary Use Case: Translating Dutch text to English / English text to Dutch across various domains
Recommended Applications:
- General-purpose Dutch-to-English and English-to-Dutch translation
- Content localization
- Cross-lingual communication tools
- Educational language learning applications
Performance
Benchmark Results
Training Details
Training Procedure
Method: Supervised Fine-Tuning (SFT)
- The model was trained on parallel Dutch-English text pairs
- Standard cross-entropy loss optimization
- The base Qwen3-0.6B model was adapted specifically for translation tasks
Training Data
The model was trained on Dutch-English parallel corpora. The following details are still to be documented:
- Dataset name and source
- Number of training examples
- Domain coverage (general, technical, literary, etc.)
- Data preprocessing steps
Usage
Basic Usage Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "OpenOranje/TweeTaal-nl-en-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare input using the prompt format the model was trained on
dutch_text = "Hallo, hoe gaat het met je?"
prompt = f"Translate the following text from Dutch to English:\n{dutch_text}"
messages = [{"role": "user", "content": prompt}]

# Generate translation. add_generation_prompt=True appends the assistant
# turn marker; do_sample=True is needed for temperature to take effect.
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt
translation = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(translation)
```
Prompt Format
The model expects input in one of the following formats:
```
Translate the following text from Dutch to English:\n{dutch_text}
Translate the following text from English to Dutch:\n{english_text}
```
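The two formats above can be wrapped in a small helper when both directions are needed. This is an illustrative sketch; the `build_prompt` and `as_chat` names are not part of the model's API:

```python
# Sketch of a prompt builder for the two supported translation directions.
# Function names are illustrative, not part of this model card's API.
DIRECTIONS = {
    ("nl", "en"): "Translate the following text from Dutch to English:",
    ("en", "nl"): "Translate the following text from English to Dutch:",
}

def build_prompt(text: str, src: str, tgt: str) -> str:
    """Return the instruction prompt the model expects."""
    if (src, tgt) not in DIRECTIONS:
        raise ValueError(f"Unsupported direction: {src}->{tgt}")
    return f"{DIRECTIONS[(src, tgt)]}\n{text}"

def as_chat(text: str, src: str, tgt: str) -> list[dict]:
    """Wrap the prompt in the chat-message format used with apply_chat_template."""
    return [{"role": "user", "content": build_prompt(text, src, tgt)}]
```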
Inference Parameters
Recommended generation parameters:
- Temperature: 0.7 (adjust for creativity vs. consistency)
- Max tokens: Set based on expected translation length
- Top-p: 0.9 (nucleus sampling)
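These recommendations can be collected into one kwargs dict and passed to `model.generate`. A sketch; the `max_new_tokens` value of 256 is an assumed placeholder to adjust per use case:

```python
# Recommended generation settings from this card, as kwargs for
# transformers' model.generate(). do_sample=True is required for
# temperature and top_p to take effect.
gen_kwargs = {
    "do_sample": True,
    "temperature": 0.7,     # lower for more literal, consistent output
    "top_p": 0.9,           # nucleus sampling
    "max_new_tokens": 256,  # assumed value; set to expected translation length
}
# usage: outputs = model.generate(inputs, **gen_kwargs)
```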
Limitations
- Context Length: Trained with a 4096-token context window; longer inputs should be split or truncated
- Rare Words: May struggle with highly specialized terminology or rare vocabulary not well-represented in training data
- Informal Language: Performance on slang, dialects, or very informal Dutch may vary
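To stay within the 4096-token context window, long documents can be split on sentence boundaries and translated chunk by chunk. A naive character-based sketch; production code should count tokens with the model's tokenizer instead:

```python
import re

def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Split text on sentence-ish boundaries so each chunk stays small.

    Character counts approximate token limits; for exact limits, measure
    with len(tokenizer(text)["input_ids"]) instead (assumed workflow,
    not prescribed by this model card).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be translated independently with the usage example above and the results concatenated.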
Ethical Considerations
- Training Data Bias: The model may reflect biases present in the training data
- Cultural Nuances: Some cultural expressions may not translate perfectly
Contact
For questions or issues, please contact: [email protected]
Additional Resources
- Base Model: Qwen3-0.6B
- Training Code: [TBD]
- Dataset: Data
Version History
- v1.0 (2025-10-24): Initial release
License: Apache 2.0