---
pipeline_tag: robotics
tags:
- smolvla
library_name: lerobot
datasets:
- lerobot/svla_so101_pickplace
---

## SmolVLA: A vision-language-action model for affordable and efficient robotics

Resources and technical documentation:

- [SmolVLA Paper](https://huggingface.co/papers/2506.01844)
- [SmolVLA Blogpost](https://huggingface.co/blog/smolvla)
- [Code](https://github.com/huggingface/lerobot/blob/main/lerobot/common/policies/smolvla/modeling_smolvla.py)
- [Train using Google Colab Notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/lerobot/training-smolvla.ipynb#scrollTo=ZO52lcQtxseE)
- [SmolVLA HF Documentation](https://huggingface.co/docs/lerobot/smolvla)

SmolVLA was designed by Hugging Face. The model has 450M parameters in total and can be used with the [LeRobot library](https://github.com/huggingface/lerobot).

Before proceeding, set up the environment by following the [Installation Guide](https://huggingface.co/docs/lerobot/installation) in the docs.

Install the extra SmolVLA dependencies:

```bash
pip install -e ".[smolvla]"
```

Example of fine-tuning the pretrained SmolVLA model (`smolvla_base`):

```bash
python lerobot/scripts/train.py \
  --policy.path=lerobot/smolvla_base \
  --dataset.repo_id=lerobot/svla_so101_pickplace \
  --batch_size=64 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true
```

Example of training the SmolVLA network with a pretrained VLM and an action expert initialized from scratch:

```bash
python lerobot/scripts/train.py \
  --policy.type=smolvla \
  --dataset.repo_id=lerobot/svla_so101_pickplace \
  --batch_size=64 \
  --steps=200000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true
```
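
To reuse the fine-tuning command with a different dataset, run name, or step count, it may help to wrap it in a small shell function. The following is a minimal sketch, not part of LeRobot: the function name `build_train_cmd` and its positional arguments are hypothetical, and the flags are only those already shown in the fine-tuning example from `smolvla_base`. It prints the command (a dry run) rather than launching training.

```bash
# Hypothetical helper (not a LeRobot command): prints the fine-tuning
# command for smolvla_base with the given dataset, run name, and steps.
build_train_cmd() {
  local dataset="${1:-lerobot/svla_so101_pickplace}"  # dataset repo id
  local job="${2:-my_smolvla}"                        # run name
  local steps="${3:-20000}"                           # training steps

  # printf repeats the '%s ' format once per argument.
  printf '%s ' \
    python lerobot/scripts/train.py \
    --policy.path=lerobot/smolvla_base \
    "--dataset.repo_id=${dataset}" \
    --batch_size=64 \
    "--steps=${steps}" \
    "--output_dir=outputs/train/${job}" \
    "--job_name=${job}_training" \
    --policy.device=cuda \
    --wandb.enable=true
  printf '\n'
}

# Dry run with the defaults; this reproduces the flags of the
# fine-tuning example and does not start training.
build_train_cmd
```

Calling `build_train_cmd my_user/my_dataset run1 5000` (placeholder values) prints the same command with those three settings substituted; copy the printed line into a terminal to launch the run.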