WorldVLN Model Weights

This repository contains the model weights introduced in the paper "WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation" (https://arxiv.org/abs/2605.15964).

It includes the weights for the world model backbone and the action decoder.

For more details about the model and its implementation, please refer to the GitHub repository: https://github.com/EmbodiedCity/WorldVLN.code
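
As a starting point for fetching the backbone and action decoder weights, the following is a minimal sketch using the `huggingface_hub` client. The repository id `EmbodiedCity/WorldVLN` is assumed from this model page, the local directory name is arbitrary, and the list of checkpoint suffixes is a guess, since the file layout of the repository is not described here. The helper names `download_weights` and `list_weight_files` are hypothetical and not part of the WorldVLN code.

```python
from pathlib import Path

# Assumption: repository id taken from the model page name.
REPO_ID = "EmbodiedCity/WorldVLN"
# Assumption: common checkpoint suffixes; the actual file types may differ.
WEIGHT_SUFFIXES = {".safetensors", ".pt", ".pth", ".bin", ".ckpt"}


def download_weights(local_dir="worldvln_weights"):
    """Download a snapshot of the repository and return its local path.

    Requires `pip install huggingface_hub`; imported lazily so the
    listing helper below works without it.
    """
    from huggingface_hub import snapshot_download

    return Path(snapshot_download(repo_id=REPO_ID, local_dir=local_dir))


def list_weight_files(root):
    """Return sorted relative paths of files that look like checkpoints."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in WEIGHT_SUFFIXES
    )
```

Calling `list_weight_files(download_weights())` would then show which checkpoint files were fetched. How to load them into the world model backbone and the action decoder is covered by the GitHub repository, not by this sketch.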

Citation

If this work has contributed to your research, please cite it:

@misc{zhao2026worldvln,
      title={WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation},
      author={Baining Zhao and Jiacheng Xu and Weicheng Feng and Xin Zhang and Zhaolu Wang and Haoyang Wang and Shilong Ji and Ziyou Wang and Jianjie Fang and Zhiheng Zheng and Weichen Zhang and Yu Shang and Wei Wu and Chen Gao and Xinlei Chen and Yong Li},
      year={2026},
      eprint={2605.15964},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2605.15964},
}