{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "NQUk3Y0WwYZ4" }, "source": [ "# 🤗 x 🦾: Training SmolVLA with LeRobot Notebook\n", "\n", "Welcome to the **LeRobot SmolVLA training notebook**! This notebook provides a ready-to-run setup for training imitation learning policies using the [🤗 LeRobot](https://github.com/huggingface/lerobot) library.\n", "\n", "In this example, we train an `SmolVLA` policy using a dataset hosted on the [Hugging Face Hub](https://huggingface.co/), and optionally track training metrics with [Weights & Biases (wandb)](https://wandb.ai/).\n", "\n", "## ⚙️ Requirements\n", "- A Hugging Face dataset repo ID containing your training data (`--dataset.repo_id=YOUR_USERNAME/YOUR_DATASET`)\n", "- Optional: A [wandb](https://wandb.ai/) account if you want to enable training visualization\n", "- Recommended: GPU runtime (e.g., NVIDIA A100) for faster training\n", "\n", "## ⏱️ Expected Training Time\n", "Training with the `SmolVLA` policy for 20,000 steps typically takes **about 5 hours on an NVIDIA A100** GPU. On less powerful GPUs or CPUs, training may take significantly longer!\n", "\n", "## Example Output\n", "Model checkpoints, logs, and training plots will be saved to the specified `--output_dir`. If `wandb` is enabled, progress will also be visualized in your wandb project dashboard.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "MOJyX0CnwA5m" }, "source": [ "## Install conda\n", "This cell uses `condacolab` to bootstrap a full Conda environment inside Google Colab.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QlKjL1X5t_zM" }, "outputs": [], "source": [ "!pip install -q condacolab\n", "import condacolab\n", "condacolab.install()" ] }, { "cell_type": "markdown", "metadata": { "id": "DxCc3CARwUjN" }, "source": [ "## Install LeRobot\n", "This cell clones the `lerobot` repository from Hugging Face, installs FFmpeg (version 7.1.1), and installs the package in editable mode.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dgLu7QT5tUik" }, "outputs": [], "source": [ "!git clone https://github.com/huggingface/lerobot.git\n", "!conda install ffmpeg=7.1.1 -c conda-forge\n", "!cd lerobot && pip install -e ." ] }, { "cell_type": "markdown", "metadata": { "id": "Q8Sn2wG4wldo" }, "source": [ "## Weights & Biases login\n", "This cell logs you into Weights & Biases (wandb) to enable experiment tracking and logging." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "PolVM_movEvp" }, "outputs": [], "source": [ "!wandb login" ] }, { "cell_type": "markdown", "metadata": { "id": "zTWQAgX9xseE" }, "source": [ "## Install SmolVLA dependencies" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "DiHs0BKwxseE" }, "outputs": [], "source": [ "!cd lerobot && pip install -e \".[smolvla]\"" ] }, { "cell_type": "markdown", "metadata": { "id": "IkzTo4mNwxaC" }, "source": [ "## Start training SmolVLA with LeRobot\n", "\n", "This cell runs the `train.py` script from the `lerobot` library to train a robot control policy. \n", "\n", "Make sure to adjust the following arguments to your setup:\n", "\n", "1. `--dataset.repo_id=YOUR_HF_USERNAME/YOUR_DATASET`: \n", " Replace this with the Hugging Face Hub repo ID where your dataset is stored, e.g., `pepijn223/il_gym0`.\n", "\n", "2. `--batch_size=64`: means the model processes 64 training samples in parallel before doing one gradient update. Reduce this number if you have a GPU with low memory.\n", "\n", "3. `--output_dir=outputs/train/...`: \n", " Directory where training logs and model checkpoints will be saved.\n", "\n", "4. `--job_name=...`: \n", " A name for this training job, used for logging and Weights & Biases.\n", "\n", "5. `--policy.device=cuda`: \n", " Use `cuda` if training on an NVIDIA GPU. Use `mps` for Apple Silicon, or `cpu` if no GPU is available.\n", "\n", "6. `--wandb.enable=true`: \n", " Enables Weights & Biases for visualizing training progress. You must be logged in via `wandb login` before running this." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "ZO52lcQtxseE" }, "outputs": [], "source": [ "!cd lerobot && python lerobot/scripts/train.py \\\n", " --policy.path=lerobot/smolvla_base \\\n", " --dataset.repo_id=${HF_USER}/mydataset \\\n", " --batch_size=64 \\\n", " --steps=20000 \\\n", " --output_dir=outputs/train/my_smolvla \\\n", " --job_name=my_smolvla_training \\\n", " --policy.device=cuda \\\n", " --wandb.enable=true" ] }, { "cell_type": "markdown", "metadata": { "id": "2PBu7izpxseF" }, "source": [ "## Login into Hugging Face Hub\n", "Now after training is done login into the Hugging Face hub and upload the last checkpoint" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8yu5khQGIHi6" }, "outputs": [], "source": [ "!huggingface-cli login" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zFMLGuVkH7UN" }, "outputs": [], "source": [ "!huggingface-cli upload ${HF_USER}/my_smolvla \\\n", " /content/lerobot/outputs/train/my_smolvla/checkpoints/last/pretrained_model" ] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "A100", "machine_shape": "hm", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }