Tamazight Causal Language Model

Overview

TamazightForCausalLM is a custom autoregressive language model trained for Tamazight text generation.

This model uses:

- Custom architecture: TamazightForCausalLM
- Custom tokenizer based on SentencePiece (.model)
- Full integration with Transformers
- Support for generate() (sampling, greedy, beam search)

Model Details

- Architecture: Decoder-only Transformer
- Model type: tamazight
- Tokenizer: SentencePiece unigram
- Special tokens: IDs are defined in config.json

Files Included

config.json generation_config.json model.safetensors tokenizer_config.json tamazight.model

configuration_tamazight.py modeling_tamazight.py tokenization_tamazight.py __init__.py

Installation

pip install transformers torch

Usage

Because this is a custom architecture, you must pass trust_remote_code=True when loading both the tokenizer and the model.

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the custom tokenizer and model classes shipped in the repository
tokenizer = AutoTokenizer.from_pretrained(
    "YOUR_USERNAME/tamazight-model",
    trust_remote_code=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "YOUR_USERNAME/tamazight-model",
    trust_remote_code=True,
)

# Encode the prompt as PyTorch tensors
inputs = tokenizer("azul fellak", return_tensors="pt")

# Sample up to 50 new tokens at temperature 0.8
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training

- Trained from scratch
- Tokenizer trained using SentencePiece
- Special token IDs aligned with model config
- Compatible with modern Transformers versions

Notes

- Special token IDs are defined in config.json
- Generation settings are stored in generation_config.json
- Embeddings are aligned with tokenizer vocabulary size
- Fully portable via save_pretrained() / from_pretrained()

License

Specify your license here (e.g., MIT, Apache 2.0).

Acknowledgments

Built using the Transformers library by Hugging Face.
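The Usage call passes do_sample=True and temperature=0.8, and the Overview mentions greedy decoding as an alternative. As a rough illustration of the difference, the sketch below applies both strategies to a single decoding step using only the standard library. The logits are made-up numbers, the helper names are hypothetical, and this is not the code path that generate() runs internally; beam search is left out for brevity.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(logits):
    """Greedy decoding: always take the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def pick_sampled(logits, temperature, rng):
    """Sampling: draw one token from the temperature-scaled distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Made-up scores for a 4-token vocabulary (illustration only).
toy_logits = [2.0, 1.0, 0.5, -1.0]
rng = random.Random(0)  # seeded so repeated runs draw the same tokens

print("greedy pick:", pick_greedy(toy_logits))
print("sampled picks (T=0.8):", [pick_sampled(toy_logits, 0.8, rng) for _ in range(10)])
print("probabilities at T=0.8:", [round(p, 3) for p in softmax(toy_logits, 0.8)])
print("probabilities at T=1.0:", [round(p, 3) for p in softmax(toy_logits, 1.0)])
```

A temperature below 1.0 concentrates probability on the top-scoring tokens, so sampling at 0.8 stays closer to the greedy choice than sampling at 1.0 while still allowing variation between runs.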

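The Notes state that special token IDs live in config.json and that the embeddings match the tokenizer vocabulary size. One way to sanity-check that kind of alignment is sketched below. It assumes the config uses the conventional Transformers key names (vocab_size, pad_token_id, bos_token_id, eos_token_id, unk_token_id), which this card does not confirm for the custom TamazightConfig, and the function name and example values are hypothetical.

```python
def check_token_alignment(config, tokenizer_vocab_size):
    """Return a list of problems; an empty list means everything lines up."""
    problems = []

    # The embedding table is sized from vocab_size, so it must match the tokenizer.
    vocab_size = config.get("vocab_size")
    if vocab_size != tokenizer_vocab_size:
        problems.append(
            f"vocab_size mismatch: config={vocab_size}, tokenizer={tokenizer_vocab_size}"
        )

    # Every special token ID that is set must be a valid index into the vocabulary.
    for key in ("pad_token_id", "bos_token_id", "eos_token_id", "unk_token_id"):
        token_id = config.get(key)
        if token_id is None:
            continue  # this key is not set in the config
        if not 0 <= token_id < tokenizer_vocab_size:
            problems.append(f"{key}={token_id} is outside the tokenizer vocabulary")

    return problems


# Hypothetical values for illustration; the real ones are in config.json.
example_config = {"vocab_size": 32000, "pad_token_id": 0, "bos_token_id": 1, "eos_token_id": 2}

print(check_token_alignment(example_config, 32000))
print(check_token_alignment(example_config, 30000))
```

With the real files, the dictionary would come from json.load() on config.json and the second argument from len(tokenizer), assuming the custom tokenizer reports its size the way standard Transformers tokenizers do.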

Model size: 98.3M params (Safetensors, tensor type F32)
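The reported figures (98.3M parameters stored as F32) give a quick estimate of the weight footprint. The arithmetic below is a back-of-the-envelope sketch: it counts weights only, ignoring activations and any framework overhead, and the F16 row is a hypothetical comparison since the checkpoint is listed as F32.

```python
PARAMS = 98.3e6  # parameter count reported for the checkpoint

# Bytes used per parameter for common tensor types.
BYTES_PER_PARAM = {"F32": 4, "F16": 2}


def weight_bytes(n_params, dtype):
    """Approximate size of the weights alone, in bytes."""
    return n_params * BYTES_PER_PARAM[dtype]


for dtype in ("F32", "F16"):
    size = weight_bytes(PARAMS, dtype)
    print(f"{dtype}: {size / 1e6:.1f} MB ({size / 2**20:.1f} MiB)")
```

At 4 bytes per parameter, the F32 weights come to roughly 393 MB, which is in line with what one would expect model.safetensors to occupy on disk.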

Repository: Velkamez/tamazight-unigram
