Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 10 days ago • 117
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published Apr 8 • 121
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 325
arithmetic-circuit-overloading/Llama-3.3-70B-Instruct-v2-3d-4M-400K-0.1-reverse-padzero-99-512D-1L-2H-2048I Text Generation • 3.69M • Updated Apr 5 • 40 • 1
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 342