Learning Video Context as Interleaved Multimodal Sequences Paper • 2407.21757 • Published Jul 31, 2024
Code2World: A GUI World Model via Renderable Code Generation Paper • 2602.09856 • Published 22 days ago • 197
Enhancing the Medical Context-Awareness Ability of LLMs via Multifaceted Self-Refinement Learning Paper • 2511.10067 • Published Nov 13, 2025
FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection Paper • 2601.03928 • Published Jan 7 • 17
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands Paper • 2512.24965 • Published Dec 31, 2025 • 42
Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans? Paper • 2512.13281 • Published Dec 15, 2025 • 64