# OSTLM-V1: English to Hebrew Translator πŸš€

A Neural Machine Translation (NMT) model based on a custom Transformer (Encoder-Decoder) architecture, trained from scratch. This model is designed to translate English sentences into Hebrew using multilingual encoding and specialized layer configurations.

πŸ“ Model Description

OSTLM (Open Source Translation Language Model) demonstrates translation capabilities in a compact, efficient format. Unlike generic pre-trained models, this model was trained specifically on English-Hebrew pairs to capture the nuances of that language pair.

- **Architecture:** Custom Transformer (6 encoder layers, 6 decoder layers).
- **Parameters:** ~0.2B (F32 weights, Safetensors format).
- **Dataset:** Trained on the OPUS-100 (en-he) dataset.
- **Tokenization:** Uses `BertTokenizer` (multilingual-cased).
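The encoder-decoder shape above can be sketched with `torch.nn.Transformer`. Note that the hidden size, head count, and feed-forward width below are illustrative placeholders — the card only specifies the layer counts, not the released configuration.

```python
import torch
import torch.nn as nn

# Sketch of a 6+6 encoder-decoder stack. d_model / nhead / dim_feedforward
# are placeholder values; the card only states the layer counts.
sketch = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=6,
    num_decoder_layers=6,
    dim_feedforward=2048,
    batch_first=True,
)

src = torch.randn(1, 12, 512)  # (batch, source length, d_model)
tgt = torch.randn(1, 7, 512)   # (batch, target length, d_model)
out = sketch(src, tgt)         # output follows the target shape
print(out.shape)
```

The decoder output has the same sequence length as the target, which is why generation proceeds one token at a time at inference.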

## πŸš€ How to Use (Inference)

The model performs best when the output length is controlled via inference parameters: capping generation prevents hallucinated continuations and ensures the translation stops at the right point.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "Raziel1234/ostlm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, trust_remote_code=True)

# Run on GPU when available.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

def translate(text, max_tokens=10):
    inputs = tokenizer(text, return_tensors="pt", padding=True).to(device)

    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_tokens,   # hard cap on output length
            num_beams=5,                 # beam search for better fluency
            repetition_penalty=2.5,      # discourage repeated tokens
            no_repeat_ngram_size=2,      # block repeated bigrams
            decoder_start_token_id=tokenizer.cls_token_id,  # [CLS] starts decoding
            eos_token_id=tokenizer.sep_token_id,            # [SEP] ends decoding
            pad_token_id=tokenizer.pad_token_id,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    english_text = "Hey! "
    print(f"EN: {english_text}")
    print(f"HE: {translate(english_text, max_tokens=3)}")
```
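Since `max_new_tokens` caps every call, a simple heuristic can scale the cap with the input length instead of hard-coding it. The per-word ratio and clamping bounds below are illustrative assumptions, not tuned properties of the model:

```python
def suggested_max_tokens(text, per_word=2, floor=8, ceiling=64):
    """Heuristic cap for max_new_tokens: roughly two subword tokens per
    source word, clamped to a sane range. The 2x ratio and the bounds are
    assumptions for illustration, not measured properties of the model."""
    n_words = len(text.split())
    return max(floor, min(ceiling, n_words * per_word))

print(suggested_max_tokens("Hello world"))                  # short input hits the floor: 8
print(suggested_max_tokens("one two three four five"))      # 5 words * 2 = 10
print(suggested_max_tokens(" ".join(["word"] * 100)))       # long input hits the ceiling: 64
```

A call would then look like `translate(text, max_tokens=suggested_max_tokens(text))`, keeping short inputs from over-generating while leaving headroom for longer sentences.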