[NeurIPS 2025] AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning

ucla-mobility/AutoVLA

AutoVLA


[NeurIPS 2025] This is the official implementation of the paper:

AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning

Zewei Zhou*, Tianhui Cai*, Seth Z. Zhao, Yun Zhang, Zhiyu Huang†, Bolei Zhou, Jiaqi Ma

University of California, Los Angeles - * Equal contribution, † Project leader

(Teaser figure)

  • 🚗 AutoVLA integrates chain-of-thought (CoT) reasoning and physical action tokenization to directly generate planning trajectories through a unified autoregressive process, dynamically switching between dual thinking modes.
  • ⚙️ Supervised fine-tuning (SFT) equips the model with dual thinking modes: fast thinking (trajectory-only output) and slow thinking (trajectory output enhanced with chain-of-thought reasoning).
  • 🪜 Reinforcement fine-tuning (RFT) based on Group Relative Policy Optimization (GRPO) further improves planning performance and efficiency by reducing unnecessary reasoning in straightforward scenarios.
  • 🔥 Extensive experiments on real-world and simulated datasets and benchmarks, including nuPlan, nuScenes, Waymo, and CARLA, demonstrate competitive performance in both open-loop and closed-loop settings.
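As a rough illustration of the group-relative idea behind GRPO (a sketch of the general algorithm, not the repository's implementation, which is not yet released): several candidate trajectories are sampled for the same driving scene, each is scored by a reward, and each trajectory's advantage is its reward normalized by the mean and standard deviation of its sampling group, so no separate value network is needed. The reward values below are hypothetical.

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: normalize each sampled trajectory's
    reward by the mean and std of its sampling group (GRPO-style)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Hypothetical rewards for 4 candidate trajectories sampled for one scene
adv = grpo_advantages([1.0, 0.5, 0.0, 0.5])
# Above-average trajectories get positive advantage, below-average negative,
# and the advantages sum to (approximately) zero within the group.
```

These per-sample advantages then weight the policy-gradient update on the action tokens of each sampled trajectory.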

News

Release Plan

  • 2025/06: ✅ AutoVLA paper.
  • 2025/11: Reasoning data.
  • 2025/11: Reasoning annotation code.
  • 2025/11: AutoVLA code.
  • 2025/11: AutoVLA checkpoints.

Citation

If you find this repository useful for your research, please consider giving us a star 🌟 and citing our paper.

@article{zhou2025autovla,
 title={AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning},
 author={Zhou, Zewei and Cai, Tianhui and Zhao, Seth Z. and Zhang, Yun and Huang, Zhiyu and Zhou, Bolei and Ma, Jiaqi},
 journal={arXiv preprint arXiv:2506.13757},
 year={2025}
}
