nvidia/Frame_VAD_Multilingual_MarbleNet_v2.0 • Voice Activity Detection • Updated May 14, 2025 • 15.5k • 35
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control • Paper • 2601.05138 • Published Jan 8 • 18
FFP-300K: Scaling First-Frame Propagation for Generalizable Video Editing • Paper • 2601.01720 • Published Jan 5 • 6
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation • Paper • 2504.14899 • Published Apr 21, 2025 • 20