atrost/nanochat-d24-nested-k4-alpha01-20260425-k1

This is a standalone nanochat-native checkpoint extracted from atrost/nanochat-d24-nested-k4-alpha01-20260425/full/d24_nested_k4_alpha01_full_20260425_155507 at step 003624.

  • Family: nested
  • Submodel: k1
  • Config: {"attn_alpha_init_value": 1.0, "dec_alpha_init_value": 1.0, "ffn_alpha_init_value": 1.0, "n_embd": 384, "n_head": 6, "n_kv_head": 6, "n_layer": 24, "nested_block_heads": 6, "sequence_len": 2048, "vocab_size": 32768, "window_pattern": "L"}
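As a quick sanity check on the config above, the following sketch parses the JSON and derives the per-head dimension. It assumes the embedding width is split evenly across attention heads, which the model card does not state; the variable names `head_dim` and `shares_kv_heads` are illustrative and not part of nanochat.

```python
import json

# Config copied verbatim from the model card above.
config = json.loads(
    '{"attn_alpha_init_value": 1.0, "dec_alpha_init_value": 1.0, '
    '"ffn_alpha_init_value": 1.0, "n_embd": 384, "n_head": 6, '
    '"n_kv_head": 6, "n_layer": 24, "nested_block_heads": 6, '
    '"sequence_len": 2048, "vocab_size": 32768, "window_pattern": "L"}'
)

# Assumption: the embedding width is divided evenly across attention heads.
assert config["n_embd"] % config["n_head"] == 0
head_dim = config["n_embd"] // config["n_head"]

# Equal query and key/value head counts mean no key/value heads are shared.
shares_kv_heads = config["n_kv_head"] < config["n_head"]

print(f"head_dim={head_dim}, layers={config['n_layer']}, shares_kv_heads={shares_kv_heads}")
```

Under that assumption, the k1 submodel works out to 64 dimensions per head across 24 layers.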

Load it into the local nanochat checkpoint layout before evaluation:

from nanochat.hf_utils import stage_hf_checkpoint

stage_hf_checkpoint(
    repo_id="atrost/nanochat-d24-nested-k4-alpha01-20260425-k1",
    path_in_repo="d24_nested_k4_alpha01_20260425_k1",
    model_tag="d24_nested_k4_alpha01_20260425_k1",
    step=3624,
    base_dir="/tmp/nanochat-eval",
)

Then run the evaluation against the staged checkpoint:

NANOCHAT_BASE_DIR=/tmp/nanochat-eval python -m scripts.base_eval --model-tag d24_nested_k4_alpha01_20260425_k1 --step 3624
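
If the evaluation needs to be launched from a script rather than a shell, the same invocation can be assembled in Python. This is a minimal sketch: `build_eval_invocation` is a hypothetical helper, not part of nanochat, and it only mirrors the environment variable and flags shown in the command above.

```python
import os


def build_eval_invocation(model_tag, step, base_dir):
    """Return (env, argv) mirroring the eval command above (hypothetical helper)."""
    # Copy the current environment and point nanochat at the staged checkpoint.
    env = dict(os.environ, NANOCHAT_BASE_DIR=base_dir)
    argv = [
        "python", "-m", "scripts.base_eval",
        "--model-tag", model_tag,
        "--step", str(step),
    ]
    return env, argv


env, argv = build_eval_invocation(
    "d24_nested_k4_alpha01_20260425_k1", 3624, "/tmp/nanochat-eval"
)
print(" ".join(argv))
```

The returned pair can be passed to `subprocess.run(argv, env=env, check=True)`, presumably from the nanochat repository root so that `scripts.base_eval` is importable.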