Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques, Mar 24, 2025 • 20
Universal Assisted Generation: Faster Decoding with Any Assistant Model, Oct 29, 2024 • 60
Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding, Jan 30, 2024 • 9