Yingjie Lei
ChaceLei2004
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 hour ago
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
upvoted a paper about 1 hour ago
On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models
upvoted a paper 3 days ago
RISE-Video: Can Video Generators Decode Implicit World Rules?
Organizations
None yet
Technical Reports
RL
Infra
Adaptation
3D Vision
- Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets
  Paper • 2510.19944 • Published • 20
- Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
  Paper • 2510.23607 • Published • 179
- Masked Depth Modeling for Spatial Perception
  Paper • 2601.17895 • Published • 24
Video Understanding
Embodied AI
- World-in-World: World Models in a Closed-Loop World
  Paper • 2510.18135 • Published • 77
- GigaBrain-0: A World Model-Powered Vision-Language-Action Model
  Paper • 2510.19430 • Published • 51
- World Simulation with Video Foundation Models for Physical AI
  Paper • 2511.00062 • Published • 42
- TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers
  Paper • 2601.14133 • Published • 60
World Models
Video Generation
- VISTA: A Test-Time Self-Improving Video Generation Agent
  Paper • 2510.15831 • Published • 22
- HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives
  Paper • 2510.20822 • Published • 41
- Video-As-Prompt: Unified Semantic Control for Video Generation
  Paper • 2510.20888 • Published • 50
- The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation
  Paper • 2601.17737 • Published • 55