LLM-based Detection of Manipulative Political Narratives Paper • 2605.14354 • Published 13 days ago • 5
When to Trust Imagination: Adaptive Action Execution for World Action Models Paper • 2605.06222 • Published 20 days ago • 42
NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation Paper • 2605.10813 • Published 16 days ago • 16
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs Paper • 2605.09063 • Published 18 days ago • 78
SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems Paper • 2604.04514 • Published Apr 6 • 7
Map2World: Segment Map Conditioned Text to 3D World Generation Paper • 2605.00781 • Published 26 days ago • 25
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers Paper • 2604.02648 • Published Apr 3 • 47
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising Paper • 2604.26694 • Published 28 days ago • 6
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published Apr 24 • 227
CreativeGame: Toward Mechanic-Aware Creative Game Generation Paper • 2604.19926 • Published Apr 21 • 2
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2604.12374 • Published Apr 14 • 37
Toward Autonomous Long-Horizon Engineering for ML Research Paper • 2604.13018 • Published Apr 14 • 34
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published Apr 13 • 143
SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding Paper • 2604.09557 • Published Feb 10 • 11
Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs Paper • 2604.10480 • Published Apr 12 • 20
On Semiotic-Grounded Interpretive Evaluation of Generative Art Paper • 2604.08641 • Published Apr 9 • 4