Sparse Autoencoder (SAE) Model

This Sparse Autoencoder (SAE) was trained on activations of a LeRobot ACT policy to support interpretability analysis of robotics policies.

Model Details

Architecture: Multi-modal Sparse Autoencoder
Training Datasets: villekuosmanen/build_block_tower, villekuosmanen/fail_build_block_tower_stationary, villekuosmanen/build_block_tower_val, villekuosmanen/dAgger_build_block_tower_1.4.0, villekuosmanen/dAgger_build_block_tower_dino
Base Policy: LeRobot ACT policy
Layer Target: model.encoder.layers.3.norm2
Tokens: 77
Token Dimension: 128
Feature Dimension: 12320
Expansion Factor: 1.25
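
The numbers above fit together if the SAE operates on the flattened token sequence: 77 tokens × 128 dimensions gives 9856 inputs, and 9856 × 1.25 gives 12320 features. The short check below reproduces that arithmetic. The flattening interpretation is inferred from the numbers rather than stated by this card, and the variable names are illustrative only.

```python
# Values copied from the Model Details list above.
num_tokens = 77
token_dim = 128
expansion_factor = 1.25

# Assumption: the SAE input is the flattened (tokens x token_dim) activation.
input_dim = num_tokens * token_dim
feature_dim = int(input_dim * expansion_factor)

print(input_dim)    # 9856
print(feature_dim)  # 12320
```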

Training Configuration

Learning Rate: 0.0001
Batch Size: 16
L1 Penalty: 0.3
Epochs: 10
Optimizer: Adam
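
For readers unfamiliar with the L1 Penalty setting, the following is a minimal sketch of a common sparse autoencoder objective: reconstruction error plus a weighted L1 term on the feature activations. It is an assumption that this model's training loss has this shape. The function name `sae_loss`, the mean normalization, and the plain-list inputs are hypothetical and do not come from the physical-ai-interpretability framework.

```python
def sae_loss(x, x_hat, features, l1_penalty=0.3):
    """Mean squared reconstruction error plus an L1 sparsity penalty.

    x, x_hat: flat lists of floats (an input activation and its reconstruction).
    features: flat list of floats (the SAE's hidden feature activations).
    l1_penalty: weight of the sparsity term (0.3 matches the value listed above).
    """
    # Reconstruction term: how far the decoded output is from the input.
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # Sparsity term: mean absolute feature activation (assumed normalization).
    l1 = sum(abs(f) for f in features) / len(features)
    return mse + l1_penalty * l1
```

A larger penalty pushes more feature activations toward zero at the cost of reconstruction accuracy.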

Usage

from physical_ai_interpretability.sae.trainer import load_sae_from_hub

# Load model from Hub
model = load_sae_from_hub("villekuosmanen/build_block_tower_all_small_sae")

# Or load using the builder (device='cuda' assumes a CUDA-capable GPU is available)
from physical_ai_interpretability.sae.builder import SAEBuilder
builder = SAEBuilder(device='cuda')
model = builder.load_from_hub("villekuosmanen/build_block_tower_all_small_sae")

Out-of-Distribution Detection

This SAE model can be used for OOD detection with LeRobot policies:

from physical_ai_interpretability.ood import OODDetector

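# Create OOD detector with Hub-loaded SAE
# (your_policy is a placeholder for an already-loaded LeRobot policy)
ood_detector = OODDetector(
    policy=your_policy,
    sae_hub_repo_id="villekuosmanen/build_block_tower_all_small_sae"
)

# Fit the threshold on a validation dataset, then check a single observation
# (validation_dataset and observation are placeholders you supply)
ood_detector.fit_ood_threshold_to_validation_dataset(validation_dataset)
is_ood, error = ood_detector.is_out_of_distribution(observation)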
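
This card does not say how the threshold is derived. As a rough illustration of the general idea (flag an observation when its SAE reconstruction error exceeds a threshold fitted on validation errors), the sketch below uses a mean-plus-standard-deviations rule. That rule, the helper names `fit_threshold` and `check_error`, and the `num_std` parameter are all assumptions for illustration; `OODDetector` may use a percentile or a different statistic entirely.

```python
import statistics


def fit_threshold(validation_errors, num_std=3.0):
    """Hypothetical rule: mean + num_std * population std of validation errors."""
    mean = statistics.fmean(validation_errors)
    std = statistics.pstdev(validation_errors)
    return mean + num_std * std


def check_error(error, threshold):
    """Return (is_ood, error), mirroring the return order used in the usage above."""
    return error > threshold, error


# Toy reconstruction errors, not real measurements from this model.
threshold = fit_threshold([0.0, 2.0])
is_ood, error = check_error(5.0, threshold)
```

With a rule like this, a higher `num_std` trades fewer false alarms for a greater chance of missing mildly unusual observations.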

Files

model.safetensors: The trained SAE model weights
config.json: Training and model configuration
training_state.pt: Complete training state (optimizer, scheduler, metrics)
ood_params.json: OOD detection parameters (if fitted)

Citation

If you use this model in your research, please cite:

@misc{sae_model,
  title={Sparse Autoencoder for Build Block Tower},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/villekuosmanen/build_block_tower_all_small_sae}
}

Framework

This model was trained using the physical-ai-interpretability framework with LeRobot.
