Qwen3.5-27B MTP (Q8_0 GGUF)

This is a Q8_0 GGUF quantization of Qwen/Qwen3.5-27B with the MTP (Multi-Token Prediction) layer preserved, unlike most publicly available GGUFs, which strip it out.

Do not use this model with llama.cpp, which does not support it; it is only compatible with ik_llama.cpp.
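As a minimal sketch of getting started (the GGUF filename and the exact CLI flags are assumptions, based on the llama.cpp-style conventions that ik_llama.cpp inherits), building ik_llama.cpp and loading the model might look like:

```shell
# Build ik_llama.cpp, the fork required for the preserved MTP layer
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build
cmake --build build --config Release -j

# Run the Q8_0 GGUF (filename is an assumption; adjust to the
# actual file downloaded from this repository)
./build/bin/llama-cli -m Qwen3.5-27B-MTP-Q8_0.gguf -p "Hello" -n 64
```

At ~27B parameters, a Q8_0 quantization needs roughly 28 GB of memory for the weights alone, so plan for GPU offload or ample system RAM.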

Model size: 27B params
Architecture: qwen35

Quantization: Q8_0 (8-bit)


Base model: Qwen/Qwen3.5-27B