HRM-Text1 (WikiText-103)

This repository contains weights for an experimental HRM causal language model trained on the WikiText-103 dataset.

Model Description

  • Architecture: Hierarchical Recurrent Memory (HRM)
  • Training Data: wikitext/wikitext-103-raw-v1
  • Tokenizer: t5-small (slow T5 SentencePiece)
  • Vocab Size: 32100
  • Objective: Causal Language Modeling

Latest Performance (Epoch 30)

  • Validation Loss: 4.5848
  • Validation Perplexity: 97.98
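
For reference, the reported perplexity is simply the exponential of the mean validation cross-entropy loss, which the two numbers above are consistent with:

```python
import math

# Perplexity = exp(cross-entropy loss).
print(math.exp(4.5848))  # ~97.98, matching the reported value
```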

Model Tree

  • Base model: google-t5/t5-small
  • This model: Viharikvs/cmbaopenwebmath (fine-tuned from the base)
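
Since HRM is a custom architecture, loading the weights most likely requires the repository's own modeling code. A sketch under that assumption (the card does not confirm that the repo ships `trust_remote_code`-loadable code):

```python
from transformers import AutoModelForCausalLM, T5Tokenizer

# The tokenizer is the stock slow t5-small tokenizer per the card.
tokenizer = T5Tokenizer.from_pretrained("t5-small")

# Assumption: custom HRM modeling code is bundled with the repo and
# exposed via trust_remote_code; adjust if the repo documents otherwise.
model = AutoModelForCausalLM.from_pretrained(
    "Viharikvs/cmbaopenwebmath", trust_remote_code=True
)
```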