CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
Lingen Li, Guangzhi Wang, Xiaoyu Li, Zhaoyang Zhang, Qi Dou, Jinwei Gu, Tianfan Xue, Ying Shan
CVPR 2026
TL;DR: CubeComposer generates one cubemap face per time window, using an effective and efficient context mechanism. This turns a perspective video into a 4K 360° video without a memory blow-up and without a low-res-then-upscale step.
For more details, please visit our project page, paper, and GitHub repo.
Model variants
We provide two variants of CubeComposer in this repo:
- cubecomposer-3k: supports 2K/3K generation, cubemap size = 512/768, temporal window length = 9 frames.
- cubecomposer-4k: supports 4K generation, cubemap size = 960, temporal window length = 5 frames.
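The numbers above can be related with a little arithmetic. The following is a minimal sketch, not part of the released code: it assumes the common convention that an equirectangular frame is 4 face-widths wide and 2 face-widths tall, and it assumes non-overlapping temporal windows with one cubemap face generated per step. The names `Variant`, `equirect_size`, and `generation_steps` are hypothetical helpers introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    face_sizes: tuple  # cubemap face edge lengths in pixels (from the list above)
    window_len: int    # temporal window length in frames (from the list above)


# Values copied from the variant list; the dictionary itself is illustrative.
VARIANTS = {
    "cubecomposer-3k": Variant("cubecomposer-3k", (512, 768), 9),
    "cubecomposer-4k": Variant("cubecomposer-4k", (960,), 5),
}


def equirect_size(face_size):
    """Assumed convention: equirectangular width = 4 * face, height = 2 * face."""
    return 4 * face_size, 2 * face_size


def generation_steps(num_frames, window_len, num_faces=6):
    """Assumes non-overlapping windows and one face generated per step."""
    windows = math.ceil(num_frames / window_len)
    return windows * num_faces


if __name__ == "__main__":
    for variant in VARIANTS.values():
        for face in variant.face_sizes:
            width, height = equirect_size(face)
            print(f"{variant.name}: face {face} -> {width}x{height} equirectangular")
```

Under these assumptions, a face size of 512, 768, or 960 corresponds to a 2048x1024, 3072x1536, or 3840x1920 panorama, which matches the 2K, 3K, and 4K labels. The actual windowing and face ordering used by the model may differ; see the paper and GitHub repo for the authoritative details.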
Citation
If you find our model helpful in your research, please like this repo, star the GitHub repo and cite:
@article{li2026cubecomposer,
  title={CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video},
  author={Li, Lingen and Wang, Guangzhi and Li, Xiaoyu and Zhang, Zhaoyang and Dou, Qi and Gu, Jinwei and Xue, Tianfan and Shan, Ying},
  journal={arXiv preprint arXiv:2603.04291},
  year={2026}
}
License
This repository is released under the terms of the LICENSE file.
By cloning, downloading, using, or distributing this repository or any of its models or weights, you agree to comply with the terms and conditions specified in the LICENSE.
Base model: Wan-AI/Wan2.2-TI2V-5B