Upload folder using huggingface_hub
README.md CHANGED
@@ -22,7 +22,7 @@ This is a GPT-2 language model trained from scratch on Assamese monolingual text
 
 ## 📖 Model Description
 
-The Assamese GPT-2 model is based on the standard GPT-2 decoder-only transformer architecture. It is capable of generating grammatically coherent and contextually relevant Assamese text and serves as a foundation for downstream NLP tasks such as:
+The Assamese GPT-2 model is based on the standard GPT-2 decoder-only transformer architecture with 12 layers, 12 attention heads, and a hidden size of 768. It is capable of generating grammatically coherent and contextually relevant Assamese text and serves as a foundation for downstream NLP tasks such as:
 
 - Language modeling
 - Text completion/generation
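For reference, using this model for generation follows the standard `transformers` causal-LM pattern. A minimal sketch, assuming a hypothetical repo id (`your-username/assamese-gpt2`) in place of the actual one:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; substitute the actual model id for this repository.
model_id = "your-username/assamese-gpt2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode an Assamese prompt and sample a continuation.
prompt = "অসম"  # "Assam"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,   # sampling settings here are assumptions, not from the card
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```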
@@ -55,6 +55,9 @@ Data preprocessing included:
 ## 🧪 Training Procedure
 
 ### Hyperparameters
+- Architecture: GPT-2 (12 layers, 12 heads, 768 hidden size)
+- Tokenizer vocab size: 50,000
+- Context window size: 1024 tokens
 - Learning rate: 5e-5
 - Epochs: 20
 - Batch size: 64
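The listed architecture and optimization settings map directly onto `GPT2Config` and `TrainingArguments`. A minimal sketch of that mapping, with the output path and any argument not in the list above as assumptions:

```python
from transformers import GPT2Config, GPT2LMHeadModel, TrainingArguments

# Architecture from the hyperparameter list: 12 layers, 12 heads,
# hidden size 768, 50,000-token vocab, 1024-token context window.
config = GPT2Config(
    vocab_size=50_000,
    n_positions=1024,
    n_embd=768,
    n_layer=12,
    n_head=12,
)
model = GPT2LMHeadModel(config)  # randomly initialized, trained from scratch

# Optimization settings from the list; output_dir is a hypothetical path.
training_args = TrainingArguments(
    output_dir="assamese-gpt2",
    learning_rate=5e-5,
    num_train_epochs=20,
    per_device_train_batch_size=64,
)
```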