# Tamazight Causal Language Model

## Overview

TamazightForCausalLM is a custom autoregressive language model trained for Tamazight text generation.

The model provides:

- A custom architecture: `TamazightForCausalLM`
- A custom tokenizer based on SentencePiece (`.model`)
- Full integration with Transformers
- Support for `generate()` (sampling, greedy, beam search)

## Model Details

- Architecture: Decoder-only Transformer
- Model type: `tamazight`
- Tokenizer: SentencePiece unigram
- Special tokens: IDs are defined in `config.json`

## Files Included

    config.json
    generation_config.json
    model.safetensors
    tokenizer_config.json
    tamazight.model
    configuration_tamazight.py
    modeling_tamazight.py
    tokenization_tamazight.py
    __init__.py

## Installation

    pip install transformers torch

## Usage

Because this is a custom architecture, you must pass `trust_remote_code=True` when loading both the tokenizer and the model.

    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Load the custom tokenizer and model; both need trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(
        "YOUR_USERNAME/tamazight-model",
        trust_remote_code=True,
    )
    model = AutoModelForCausalLM.from_pretrained(
        "YOUR_USERNAME/tamazight-model",
        trust_remote_code=True,
    )

    # Tokenize a prompt and return PyTorch tensors.
    inputs = tokenizer("azul fellak", return_tensors="pt")

    # Sample up to 50 new tokens with temperature 0.8.
    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=True,
        temperature=0.8,
    )

    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

## Training

- Trained from scratch
- Tokenizer trained using SentencePiece
- Special token IDs aligned with the model config
- Compatible with modern Transformers versions

## Notes

- Special token IDs are defined in `config.json`
- Generation settings are stored in `generation_config.json`
- Embeddings are aligned with the tokenizer vocabulary size
- Fully portable via `save_pretrained()` / `from_pretrained()`

## License

Specify your license here (e.g., MIT, Apache 2.0).

## Acknowledgments

Built using the Transformers library by Hugging Face.
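The alignment claims in the Notes (embedding size matching the tokenizer vocabulary, special token IDs matching the model config) can be checked after loading. The following is a minimal sketch and not part of this repository: `check_alignment` is a hypothetical helper, and it assumes the custom model implements `get_input_embeddings()` and that the tokenizer supports `len()` and the usual `bos_token_id`, `eos_token_id`, and `pad_token_id` attributes, as standard Transformers classes do.

```python
def check_alignment(model, tokenizer):
    """Return a list of mismatches between a model and its tokenizer.

    Hypothetical helper for illustration only. An empty list means
    no mismatch was found by these checks.
    """
    problems = []

    # Compare the number of input embedding rows with the tokenizer size.
    # Assumption: the custom model implements get_input_embeddings().
    embedding_rows = model.get_input_embeddings().weight.shape[0]
    vocab_size = len(tokenizer)
    if embedding_rows != vocab_size:
        problems.append(
            f"embedding rows ({embedding_rows}) != tokenizer size ({vocab_size})"
        )

    # Compare special token IDs in the model config with the tokenizer's.
    for name in ("bos_token_id", "eos_token_id", "pad_token_id"):
        config_id = getattr(model.config, name, None)
        tokenizer_id = getattr(tokenizer, name, None)
        # Skip IDs that either side leaves undefined.
        if config_id is None or tokenizer_id is None:
            continue
        if config_id != tokenizer_id:
            problems.append(
                f"{name}: config={config_id}, tokenizer={tokenizer_id}"
            )

    return problems
```

With the `model` and `tokenizer` objects from the Usage section, `check_alignment(model, tokenizer)` should return an empty list if the statements in the Notes hold. A reported embedding mismatch is not always an error, since some models pad the embedding matrix beyond the vocabulary size, so treat the output as a prompt to inspect rather than a verdict.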