Video-to-Video

Vista4D: Video Reshooting with 4D Point Clouds (CVPR 2026 Highlight) – Model Checkpoints


Kuan Heng Lin1,3βˆ—, Zhizheng Liu1,4βˆ—, Pablo Salamanca1,2, Yash Kant1,2, Ryan Burgert1,2,5βˆ—, Yuancheng Xu1,2, Koichi Namekata1,2,6βˆ—, Yiwei Zhao2, Bolei Zhou4, Micah Goldblum3, Paul Debevec1,2, Ning Yu1,2
1Eyeline Labs, 2Netflix, 3Columbia University, 4UCLA, 5Stony Brook University, 6University of Oxford

βˆ—Work done during an internship at Eyeline Labs

Vista4D is a video reshooting framework that re-synthesizes the dynamic scene of an input source video from novel camera trajectories and viewpoints. It bridges the distribution shift between training and inference for point-cloud-grounded video reshooting: by training on noisy, reconstructed multiview videos, Vista4D becomes robust to point cloud artifacts from imprecise 4D reconstruction of real-world videos. Our 4D point cloud with temporally persistent static points also explicitly preserves scene content and improves camera control. Vista4D generalizes to real-world applications such as dynamic scene expansion (using a casual video capture of the scene as a background reference), 4D scene recomposition (point cloud editing), and long video inference with memory.

This is the Hugging Face repository containing our model weights. We provide two Vista4D checkpoints finetuned from [Wan-AI/Wan2.1-T2V-14B](https://huggingface.co/Wan-AI/Wan2.1-T2V-14B):

| Checkpoint | Base model | Training resolution | Training steps | Notes |
|---|---|---|---|---|
| 384p49_step=30000 | Wan2.1-T2V-14B | 672 × 384, 49 frames | 30000 | N/A |
| 720p49_step=3000 | Wan2.1-T2V-14B | 1280 × 720, 49 frames | 3000 | Finetuned from 384p49_step=30000 |

To run Vista4D inference, first download the Wan 2.1 and Vista4D checkpoints to ./checkpoints/. The Vista4D checkpoints are hosted on Eyeline-Labs/Vista4D. Download both the 384p and 720p checkpoints into ./checkpoints/vista4d/ with

hf download Eyeline-Labs/Vista4D --local-dir ./checkpoints/vista4d

If you only need one resolution, pass --include to grab just that variant with

hf download Eyeline-Labs/Vista4D --local-dir ./checkpoints/vista4d --include "384p49_step=30000/*"

or

hf download Eyeline-Labs/Vista4D --local-dir ./checkpoints/vista4d --include "720p49_step=3000/*"

You'll also need the Wan2.1-T2V-14B base model. Download it from Wan-AI/Wan2.1-T2V-14B into ./checkpoints/wan/Wan2.1-T2V-14B/ with

hf download Wan-AI/Wan2.1-T2V-14B --local-dir ./checkpoints/wan/Wan2.1-T2V-14B
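
After both downloads, ./checkpoints/ should contain the two Vista4D variants and the Wan base model. A minimal sanity-check sketch for that layout (the `missing_checkpoints` helper is hypothetical, not part of the Vista4D codebase; the directory names follow the download commands above):

```python
from pathlib import Path

def missing_checkpoints(root: str = "./checkpoints") -> list[str]:
    """Return the expected checkpoint directories not yet present under root."""
    expected = [
        "vista4d/384p49_step=30000",   # 384p Vista4D checkpoint
        "vista4d/720p49_step=3000",    # 720p Vista4D checkpoint
        "wan/Wan2.1-T2V-14B",          # Wan 2.1 base model
    ]
    base = Path(root)
    return [p for p in expected if not (base / p).is_dir()]

if __name__ == "__main__":
    gaps = missing_checkpoints()
    print("all checkpoints present" if not gaps else f"missing: {gaps}")
```

If you downloaded only one Vista4D resolution with --include, drop the other entry from `expected`.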

Instructions on how to use these weights, more results, and the paper can be found on our project page and GitHub repository.
