Endless Terminals: Scaling RL Environments for Terminal Agents Paper • 2601.16443 • Published 7 days ago • 15
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published 17 days ago • 143
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published 16 days ago • 84
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge Paper • 2601.08808 • Published 17 days ago • 38
Article Introducing OptiMind, a research model designed for optimization 15 days ago • 32
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv +8 Oct 23, 2025 • 148
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5, 2025 • 301
mradermacher/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning-i1-GGUF 8B • Updated 28 days ago • 20.1k • 9