Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published May 28, 2025 • 45
Taming LLMs by Scaling Learning Rates with Gradient Grouping Paper • 2506.01049 • Published Jun 1, 2025 • 38
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models Paper • 2505.21523 • Published May 23, 2025 • 13
No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL Article • Jun 3, 2025 • 98
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time Paper • 2505.24863 • Published May 30, 2025 • 97
On Learning JAX - A Framework for High Performance Machine Learning Article • Dec 3, 2023 • 3