OpenOranje/TweeTaal-nl-en-0.6B

Model Description

The TweeTaal-nl-en-0.6B model has been fine-tuned on Dutch-English and English-Dutch translation pairs to provide accurate, fluent translations. Its compact 0.6B-parameter size makes it suitable for deployment in resource-constrained environments while maintaining strong translation quality.

Intended Use

Primary Use Case: Translating Dutch text to English and English text to Dutch across a variety of domains

Recommended Applications:

  • General-purpose Dutch-to-English and English-to-Dutch translation
  • Content localization
  • Cross-lingual communication tools
  • Educational language learning applications

Performance

Benchmark Results


Training Details

Training Procedure

Method: Supervised Fine-Tuning (SFT)

  • The model was trained on parallel Dutch-English text pairs
  • Standard cross-entropy loss optimization
  • The base Qwen/Qwen3-0.6B model was adapted specifically for translation tasks
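
For illustration, a minimal sketch of how one parallel pair might be formatted into a chat-style SFT example. The helper name and exact wording are assumptions, loosely based on this card's documented prompt format; the actual training pipeline may differ:

```python
def make_sft_example(src_text, tgt_text, src_lang="Dutch", tgt_lang="English"):
    """Build one chat-style training example from a parallel sentence pair."""
    prompt = f"Translate the following text from {src_lang} to {tgt_lang}:\n{src_text}"
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": tgt_text},  # cross-entropy loss targets this turn
    ]

example = make_sft_example("Goedemorgen!", "Good morning!")
```

Examples in this shape can be fed to standard SFT tooling that applies the tokenizer's chat template and masks the loss on the user turn.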

Training Data

The model was trained on Dutch-English parallel corpora. (Note: Specify your actual dataset details, such as:

  • Dataset name and source
  • Number of training examples
  • Domain coverage (general, technical, literary, etc.)
  • Data preprocessing steps)

Usage

Basic Usage Example

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "OpenOranje/TweeTaal-nl-en-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prepare input using the documented prompt format
dutch_text = "Hallo, hoe gaat het met je?"
prompt = f"Translate the following text from Dutch to English:\n{dutch_text}"
messages = [{"role": "user", "content": prompt}]

# Tokenize via the chat template; return_dict=True returns the mapping
# that model.generate(**inputs, ...) expects
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
)

# Generate translation (do_sample=True is required for temperature to take effect)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the echoed prompt
translation = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

print(translation)

Prompt Format

The model expects input in the following format:

Translate the following text from Dutch to English:\n{dutch_text}
Translate the following text from English to Dutch:\n{english_text}
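
A small helper can keep prompts consistent with this format in both directions (the function name and the direction codes are illustrative, not part of the model's API):

```python
def build_prompt(text, direction="nl-en"):
    """Return the documented translation prompt for the given direction."""
    if direction == "nl-en":
        return f"Translate the following text from Dutch to English:\n{text}"
    if direction == "en-nl":
        return f"Translate the following text from English to Dutch:\n{text}"
    raise ValueError(f"unsupported direction: {direction}")

prompt = build_prompt("Hallo, hoe gaat het met je?", direction="nl-en")
```

The returned string is what goes into the `content` field of the user message before applying the chat template.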

Inference Parameters

Recommended generation parameters:

  • Temperature: 0.7 (adjust for creativity vs. consistency)
  • Max tokens: Set based on expected translation length
  • Top-p: 0.9 (nucleus sampling)
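
The recommendations above can be collected into a single set of generation kwargs. This is a sketch under assumptions: the `max_new_tokens` heuristic (about 1.5x the input length, with a floor) is illustrative, not an official setting:

```python
def generation_kwargs(input_token_count):
    """Assemble recommended sampling parameters for model.generate(**inputs, **kwargs)."""
    return {
        "do_sample": True,          # required for temperature/top_p to take effect
        "temperature": 0.7,         # creativity vs. consistency trade-off
        "top_p": 0.9,               # nucleus sampling
        "max_new_tokens": max(32, int(input_token_count * 1.5)),  # rough length budget
    }

kwargs = generation_kwargs(100)
```

Lower the temperature (or disable sampling entirely) when deterministic output matters more than fluency variation.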

Limitations

  • Context Length: Trained with a 4096-token context window
  • Rare Words: May struggle with highly specialized terminology or rare vocabulary not well-represented in training data
  • Informal Language: Performance on slang, dialects, or very informal Dutch may vary

Ethical Considerations

  • Training Data Bias: The model may reflect biases present in the training data
  • Cultural Nuances: Some cultural expressions may not translate perfectly

Contact

For questions or issues, please contact: [email protected]


Additional Resources

Version History

  • v1.0 (2025-10-24): Initial release

License: Apache 2.0
