# HRM-Text1 (WikiText-103)
This repository contains weights for an experimental HRM Causal LM trained on the WikiText-103 dataset.
## Model Description
- Architecture: Hierarchical Recurrent Memory (HRM)
- Training Data: wikitext/wikitext-103-raw-v1
- Tokenizer: t5-small (slow T5 SentencePiece); see the loading sketch below
- Vocab Size: 32100
- Objective: Causal Language Modeling
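A minimal loading sketch under stated assumptions: the repository id comes from this card, but whether the custom HRM architecture ships modeling code on the Hub (and therefore needs `trust_remote_code=True`) is an assumption, not something this card confirms.

```python
from transformers import AutoModelForCausalLM, T5Tokenizer

# Slow T5 SentencePiece tokenizer (vocab size 32100), as stated above.
tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Assumption: the HRM weights load through the standard causal-LM auto class,
# with the custom architecture supplied via trust_remote_code.
model = AutoModelForCausalLM.from_pretrained(
    "Viharikvs/cmbaopenwebmath",
    trust_remote_code=True,
)

prompt = "The history of natural language processing"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```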
## Latest Performance (Epoch 30)
- Validation Loss: 4.5848
- Validation Perplexity: 97.98
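The two numbers are consistent: assuming the reported loss is the mean per-token cross-entropy in nats, perplexity is its exponential.

```python
import math

val_loss = 4.5848
# exp(4.5848) ≈ 97.98, matching the reported validation perplexity.
print(math.exp(val_loss))
```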
## Base Model
- google-t5/t5-small