---
license: mit
base_model:
- gen-robot/openvla-7b-rlvla-warmup
---
# VLA-RL-Study: What Can RL Bring to VLA Generalization? An Empirical Study

[![arXiv](https://img.shields.io/badge/arXiv-2505.19789-red.svg)](http://arxiv.org/abs/2505.19789)
[![Website](https://img.shields.io/badge/Website-RLVLA-green.svg)](https://rlvla.github.io)

This is the RL model, fine-tuned from the [warmed-up OpenVLA model](https://huggingface.co/gen-robot/openvla-7b-rlvla-warmup).
RL training ran for about 1.5M environment steps.
For more details, please refer to the [codebase](https://github.com/gen-robot/RL4VLA) and the [paper](http://arxiv.org/abs/2505.19789).
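
For readers who want the checkpoint files locally before following the codebase instructions, here is a minimal sketch using `snapshot_download` from `huggingface_hub`. The helper name `download_checkpoint` is hypothetical, and the default repo ID is the warm-up model listed in this card's `base_model` field; substitute this RL model's own repo ID as needed. How the downloaded weights are then loaded for evaluation is defined by the codebase, not by this snippet.

```python
from pathlib import Path
from typing import Optional

# Repo ID taken from this card's `base_model` field (the warm-up checkpoint).
# Assumption: replace it with this RL model's repo ID to fetch the RL weights.
WARMUP_REPO_ID = "gen-robot/openvla-7b-rlvla-warmup"


def download_checkpoint(repo_id: str = WARMUP_REPO_ID,
                        local_dir: Optional[str] = None) -> Path:
    """Download all files of a Hub model repo and return the local directory.

    Hypothetical helper for illustration; it only wraps `snapshot_download`.
    """
    # Imported lazily so defining this helper does not require the package.
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))
```

Calling `download_checkpoint()` fetches the full repository (several gigabytes for a 7B model) into the local Hugging Face cache, or into `local_dir` if one is given.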