---
license: mit
base_model: JackFram/llama-68m
tags:
- tiny-model
- random-weights
- testing
- llama
---

# Llama-3.3-Tiny-Instruct

This is a tiny version of the JackFram/llama-68m model with randomly initialized weights, created for testing and experimentation purposes.

## Model Details

- **Base model**: JackFram/llama-68m
- **Seed**: 42
- **Hidden size**: 768
- **Number of layers**: 2
- **Number of attention heads**: 12
- **Vocabulary size**: 32000
- **Max position embeddings**: 2048
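
A configuration with these dimensions can be reproduced directly in `transformers`. The sketch below is illustrative only and assumes the `LlamaConfig` defaults for everything not listed above (intermediate size, weight tying, etc.); it is not the exact creation script.

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM

# Seed the RNG so the random initialization is reproducible (seed 42, as listed above)
torch.manual_seed(42)

config = LlamaConfig(
    hidden_size=768,
    num_hidden_layers=2,
    num_attention_heads=12,
    vocab_size=32000,
    max_position_embeddings=2048,
)
model = LlamaForCausalLM(config)  # weights are randomly initialized, not trained
```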

## Parameters

- **Total parameters**: ~43,454,976
- **Trainable parameters**: ~43,454,976
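
The counts can be checked after loading the checkpoint. A minimal sketch (the repository name is taken from the usage example below):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")

# Count all parameters and the subset that requires gradients
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"total={total:,} trainable={trainable:,}")
```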

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")
tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")

# Generate text (note: this model has random weights!)
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Important Notes

⚠️ **This model has random weights and is not trained!** It's designed for:
- Testing model loading and inference pipelines (see the smoke-test sketch below)
- Benchmarking the model architecture
- Educational purposes
- Rapid prototyping where actual model performance isn't needed

The model will generate random/nonsensical text since it hasn't been trained on any data.
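
As an illustration of the first use case, a loading/inference smoke test might look like the sketch below (the repository name follows the usage example above; this is not part of an official test suite):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def test_model_loads_and_runs():
    # Smoke test: only checks that loading and a forward pass succeed;
    # the random-weight outputs themselves are meaningless.
    model = AutoModelForCausalLM.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")
    tokenizer = AutoTokenizer.from_pretrained("AlignmentResearch/Llama-3.3-Tiny-Instruct")
    inputs = tokenizer("smoke test", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    assert logits.shape[-1] == model.config.vocab_size
```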

## Creation

This model was created using the `upload_tiny_llama33.py` script from the minimal-grpo-trainer repository.