HRM-Text1 (WikiText-103)

This repository contains weights for an experimental HRM causal language model trained on the WikiText-103 dataset.

Model Description

  • Architecture: Hierarchical Recurrent Memory (HRM)
  • Training Data: wikitext/wikitext-103-raw-v1
  • Tokenizer: t5-small (slow T5 SentencePiece)
  • Vocab Size: 32100
  • Objective: Causal Language Modeling

Latest Performance (Epoch 30)

  • Validation Loss: 4.5848
  • Validation Perplexity: 97.98
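
For reference, the reported perplexity is simply the exponential of the mean validation cross-entropy loss, which the two numbers above are consistent with:

```python
import math

# Perplexity = exp(cross-entropy loss).
print(math.exp(4.5848))  # ~97.98, matching the reported value
```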

Model Tree

  • Base model: google-t5/t5-small
  • This model: Viharikvs/cmbaopenwebmath (fine-tuned from the base)
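
Since HRM is a custom architecture, loading the weights most likely requires the repository's own modeling code. A sketch under that assumption (the card does not confirm that the repo ships `trust_remote_code`-loadable code):

```python
from transformers import AutoModelForCausalLM, T5Tokenizer

# The tokenizer is the stock slow t5-small tokenizer per the card.
tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Assumption: custom HRM modeling code is bundled with the repo and
# exposed via trust_remote_code; adjust if the repo documents otherwise.
model = AutoModelForCausalLM.from_pretrained(
    "Viharikvs/cmbaopenwebmath", trust_remote_code=True
)
```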