---
license: mit
base_model:
- gen-robot/openvla-7b-rlvla-warmup
---

# VLA-RL-Study: What Can RL Bring to VLA Generalization? An Empirical Study

[![arXiv](https://img.shields.io/badge/arXiv-2505.19789-red.svg)](http://arxiv.org/abs/2505.19789) [![Website](https://img.shields.io/badge/Website-RLVLA-green.svg)](https://rlvla.github.io)

This is the RL model, fine-tuned from the [warmed-up OpenVLA model](https://huggingface.co/gen-robot/openvla-7b-rlvla-warmup). RL training takes about 1.5M environment steps.

For more details, please refer to the [codebase](https://github.com/gen-robot/RL4VLA) and the [paper](http://arxiv.org/abs/2505.19789).