🚀 Qwen3-30M TinyStories Pretrained (FP16) - Notebook Version

A Qwen3-30M model pretrained on the TinyStories dataset using FP16 precision in a notebook environment.

📊 Training Results

  • Final Training Loss: 1.5244
  • Final Validation Loss: 1.5601832866668701
  • Training Samples: -1 (no sample limit set; presumably the full dataset)
  • Epochs: 3
  • Precision: FP16
  • Dataset: TinyStories (child-friendly stories)
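
For intuition, the losses above can be converted to perplexity. This is a minimal sketch that assumes the reported values are mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in this card; the variable names are illustrative.

```python
import math

# Values copied from the training results above.
train_loss = 1.5244
val_loss = 1.5601832866668701

# Assumption: both losses are mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
train_ppl = math.exp(train_loss)
val_ppl = math.exp(val_loss)

print(f"train perplexity: {train_ppl:.2f}")
print(f"validation perplexity: {val_ppl:.2f}")
print(f"validation - train loss gap: {val_loss - train_loss:.4f}")
```

Under that assumption the perplexities come out to about 4.59 (training) and 4.76 (validation). The small gap suggests, but does not prove, that the model did not overfit heavily over the 3 epochs.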

🚀 Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("Mostafa8Mehrabi/qwen3-30m-tinystories-final")
model = AutoModelForCausalLM.from_pretrained(
    "Mostafa8Mehrabi/qwen3-30m-tinystories-final", 
    torch_dtype=torch.float16,
    device_map="auto"
)

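# Generate a story
prompt = "Once upon a time, there was a little girl named"
# Move the inputs to the model's device; device_map="auto" may place the model on a GPU
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_length counts the prompt tokens plus the generated tokens
outputs = model.generate(**inputs, max_length=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))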
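
The call above samples with temperature=0.7. As a rough illustration of what that setting does, the toy sketch below divides a set of logits by the temperature before applying softmax. It is a conceptual example only, not the library's internal code; the function name `softmax_with_temperature` and the logit values are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens.
toy_logits = [2.0, 1.0, 0.5]

plain = softmax_with_temperature(toy_logits, temperature=1.0)
sharpened = softmax_with_temperature(toy_logits, temperature=0.7)

print([round(p, 3) for p in plain])
print([round(p, 3) for p in sharpened])
```

A temperature below 1 shifts probability toward the highest-scoring token, so generated stories become more predictable; values above 1 do the opposite.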

πŸ“ Checkpoints

Training checkpoints (also in FP16) are available at: Mostafa8Mehrabi/qwen3-30m-tinystories-checkpoints

📖 About the TinyStories Dataset

The TinyStories dataset contains short, simple, child-friendly stories, which makes a model trained on it a reasonable fit for:

  • Story generation
  • Child-safe content creation
  • Educational applications
  • Creative writing assistance

🔧 Training Environment

This model was trained in a notebook environment with the following configuration:

  • Batch Size: 128
  • Learning Rate: 5e-05
  • Max Length: 512
  • Number of Processes: 8
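
The configuration above does not say whether the batch size of 128 is global or per process. The sketch below works through both readings to show how many sequences and, at most, how many tokens each optimizer step would cover. Both readings are assumptions, and gradient accumulation (not mentioned in this card) is ignored.

```python
# Values copied from the configuration above.
batch_size = 128
max_length = 512
num_processes = 8

# Assumption A: 128 is the global batch, split evenly across processes.
global_a = batch_size
per_process_a = batch_size // num_processes

# Assumption B: 128 is the per-process batch.
per_process_b = batch_size
global_b = batch_size * num_processes

for label, per_proc, global_batch in [
    ("A (128 is global)", per_process_a, global_a),
    ("B (128 is per process)", per_process_b, global_b),
]:
    # Upper bound: every sequence is filled to max_length tokens.
    max_tokens = global_batch * max_length
    print(f"{label}: {per_proc} sequences/process, "
          f"{global_batch} sequences/step, <= {max_tokens} tokens/step")
```

The two readings differ by a factor of 8 in tokens per step, which matters when comparing the learning rate of 5e-05 against other training runs.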

Model size: 34.7M params (Safetensors, F16 tensors)

Model tree for Mostafa8Mehrabi/qwen3-30m-tinystories-final

  • Listed as fine-tuned from: Qwen/Qwen3-0.6B