---
license: apache-2.0
base_model:
  - OuteAI/Lite-Oute-1-65M-Instruct
pipeline_tag: text-generation
---

# yujiepan/microllama-0.06B

This is the same model as OuteAI/Lite-Oute-1-65M-Instruct, converted to FP16.
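
For reference, a conversion of this kind can be sketched with `transformers` as follows (a minimal sketch and an assumption about the workflow, not necessarily the exact steps used to produce this repo):

```python
# Hypothetical sketch: load the original model in half precision and re-save it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "OuteAI/Lite-Oute-1-65M-Instruct"

model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(src)

# Save the FP16 weights and tokenizer to a new local folder.
model.save_pretrained("microllama-0.06B")
tokenizer.save_pretrained("microllama-0.06B")
```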

It is a small pretrained model that can do text generation, which makes it useful for algorithm development and debugging.
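
For example, it can be loaded through a standard `transformers` text-generation pipeline (a minimal sketch; the prompt and generation settings are arbitrary placeholders):

```python
import torch
from transformers import pipeline

# Quick smoke test with the tiny model; adjust prompt/settings as needed.
pipe = pipeline(
    "text-generation",
    model="yujiepan/microllama-0.06B",
    torch_dtype=torch.float16,
    device_map="auto",
)
print(pipe("Once upon a time,", max_new_tokens=32)[0]["generated_text"])
```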

Special thanks to the original author, OuteAI, for their hard work and contributions.

This repo is just a backup for myself. If you find this model useful, consider using the original repo instead.

## Evaluation

```bash
lm_eval --model hf \
  --model_args pretrained=yujiepan/microllama-0.06B,max_length=2048,dtype="<dtype>" \
  --tasks wikitext \
  --device cuda:0 \
  --batch_size 1
```

| Model dtype | Word perplexity |
|-------------|-----------------|
| FP32        | 59.1905         |
| BF16        | 59.1187         |
| FP16        | 59.1902         |

Tested on an A100 with `lm-eval==0.4.7`.