Qwen 3.6 27B — MXFP4 + CRACK

Stock MLX mxfp4 4-bit | CRACK abliterated (extended) | Vision + Video | Dense Hybrid SSM/Attention | 14 GB

What Is This?

This is Qwen 3.6 27B — a 27B-parameter dense vision-language model with hybrid linear + full-attention architecture, native image + video understanding, and bilingual EN/ZH capability.

It has been:

  1. MXFP4 quantized — stock MLX microscaling FP4 (group_size=32, 4-bit), vision tower preserved — 14 GB
  2. CRACK abliterated — permanent weight-level removal of safety refusal (extended recipe)
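As a rough illustration of the microscaling scheme (not the MLX implementation): each 32-element block stores FP4 E2M1 codes plus one shared power-of-two (E8M0) scale, following the OCP MX convention. A minimal pure-Python sketch:

```python
import math

# FP4 E2M1 magnitude grid (sign is stored separately)
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def mxfp4_quantize(block):
    """Quantize one block (e.g. 32 weights): FP4 codes + one shared scale.
    Returns the dequantized values and the scale, for error inspection."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block), 1.0
    # Shared E8M0 scale: 2^(floor(log2(amax)) - emax), emax = 2 for E2M1
    scale = 2.0 ** (math.floor(math.log2(amax)) - 2)
    deq = []
    for v in block:
        mag = min(abs(v) / scale, 6.0)                    # clamp to FP4 range
        code = min(FP4_GRID, key=lambda g: abs(g - mag))  # round to nearest
        deq.append(math.copysign(code * scale, v))
    return deq, scale
```

Real MXFP4 packs two 4-bit codes per byte and one scale byte per group; this sketch only shows the rounding behavior.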
| | |
|---|---|
| Base model | Qwen 3.6 27B dense hybrid VL |
| Quantization | MXFP4, 14 GB |
| MMLU-200 | 82.0% (base: 83.5%, Δ −1.5pp) |
| HarmBench-320 | 98.44% strict comply (315/320) |
| Vision | ViT preserved (image + video) |
| Context | 262,144 tokens native |
| Reasoning | Toggleable via `enable_thinking` |
| Fits on | 24 GB+ Macs |

MMLU-200 Results (thinking OFF)

82.0% total (164/200) — 1.5pp below base 83.5%, well within a 5pp bound.

| Subject | Score |
|---|---|
| Astronomy | 20/20 (100%) |
| Logical Fallacies | 19/20 (95%) |
| High School Biology | 19/20 (95%) |
| World Religions | 19/20 (95%) |
| Anatomy | 18/20 (90%) |
| High School Chemistry | 15/20 (75%) |
| College Computer Science | 15/20 (75%) |
| College Physics | 15/20 (75%) |
| Abstract Algebra | 12/20 (60%) |
| High School Mathematics | 12/20 (60%) |

HarmBench-320 — 98.44% strict comply (315/320)

| Category | Score |
|---|---|
| Chemical / Biological | 42/42 (100.0%) |
| Copyright | 80/80 (100.0%) |
| Harassment / Bullying | 21/21 (100.0%) |
| Illegal | 53/53 (100.0%) |
| Misinformation / Disinformation | 54/54 (100.0%) |
| Harmful | 17/18 (94.4%) |
| Cybercrime / Intrusion | 48/52 (92.3%) |
| **Total** | **315/320 (98.44%)** |

5 of 7 categories at 100%. Residual refusals cluster in cybercrime-intrusion (4) and harmful (1) — typical for abliterated models on specific exploit-code and graphic-content prompts.


Vision + Video

  • ✅ mlx_vlm.load() loads successfully
  • ✅ ViT preserved: image understanding verified
  • ✅ video_preprocessor_config.json preserved, video processor available
  • ✅ Bilingual EN + ZH text responses
  • ⚠️ Video inference: full video capability is retained in the model weights, but the current mlx_vlm.generate video path for Qwen 3.6 has a pipeline bug on both base and CRACK builds: video tokens render, but frames do not reach the vision tower cleanly. Use vMLX or transformers for video inference. This is not a CRACK regression.

Reasoning ON / OFF

The chat template respects `enable_thinking`. We recommend thinking ON for complex reasoning and OFF for short answers, benchmarks, and tool use.

```python
from mlx_vlm import load, generate

model, processor = load("dealignai/Qwen3.6-27B-MXFP4-CRACK")
tok = processor.tokenizer
messages = [{"role": "user", "content": "Derive 47 * 23 step by step"}]

# Thinking ON (default)
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Thinking OFF (direct answer, no <think> block)
prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False,
)
```

Requirements

  • Apple Silicon Mac (M1 or newer)
  • 24 GB+ unified memory (model weights 14 GB, inference working set ~16 GB)
  • Python 3.10+ with mlx and mlx-vlm
  • For VL usage: Pillow (for image preprocessing)
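The 14 GB figure is consistent with back-of-envelope arithmetic: 4 bits per weight plus one shared 8-bit scale per 32-element group is about 4.25 bits per parameter, and layers kept in BF16 (embeddings, vision tower) push the total toward 14 GB. A quick sanity check (approximate, not the exact on-disk layout):

```python
def mxfp4_weight_gib(n_params, group_size=32, scale_bits=8, elem_bits=4):
    """Approximate weight size of an MXFP4-quantized model in GiB."""
    bits_per_param = elem_bits + scale_bits / group_size  # 4.25 for MXFP4
    return n_params * bits_per_param / 8 / 2**30

size = mxfp4_weight_gib(27e9)  # ~13.4 GiB for the quantized layers alone
```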

Recommended runtimes:

  • vMLX — KV cache quantization, prefix cache, VL support
  • MLX Studio
  • mlx_vlm Python API

Qwen 3.6 CRACK Series

| Model | Format | Size | MMLU | Fits on |
|---|---|---|---|---|
| Qwen 3.6 27B JANG_4M + CRACK | JANG v2 mixed-precision | 16 GB | 83.5% / 99.69% HB | 24 GB Mac (best quality) |
| Qwen 3.6 27B MXFP4 + CRACK (this model) | Stock MLX mxfp4 | 14 GB | 82.0% | 24 GB Mac (broadest compatibility) |
| Qwen 3.6 35B-A3B JANGTQ4 + CRACK | TurboQuant 4-bit experts | 18 GB | 73.5% | MoE + hybrid SSM |
| Qwen 3.6 35B-A3B JANGTQ2 + CRACK | TurboQuant 2-bit experts | 11 GB | 73.0% | MoE, fits 16 GB Mac |

Recommendation: Prefer JANG_4M + CRACK for best quality (higher MMLU and HarmBench). Use this MXFP4 variant for broadest tooling compatibility (stock mlx_vlm, LM Studio, etc).


About CRACK

CRACK is a permanent weight-level abliteration — the changes are baked into the model weights, not an inference-time system prompt or LoRA. The vision tower is untouched. Bilingual refusal extraction (EN + ZH) means the model complies on both English and Chinese prompts.
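The CRACK recipe itself is not published in this card, but weight-level directional ablation in general removes a learned "refusal direction" r from a layer's output space via a rank-1 projection, so the layer can no longer emit a component along r. A generic sketch in pure Python (illustrative only, not the CRACK implementation; `ablate_direction` is a hypothetical helper):

```python
import math

def ablate_direction(W, r):
    """Return W' = (I - u u^T) W with u = r / |r|, so for a layer computing
    y = W x, the new output y' = W' x has no component along r."""
    norm = math.sqrt(sum(v * v for v in r))
    u = [v / norm for v in r]
    rows, cols = len(W), len(W[0])
    # u^T W: projection coefficient of each input column onto u
    coef = [sum(u[i] * W[i][j] for i in range(rows)) for j in range(cols)]
    return [[W[i][j] - u[i] * coef[j] for j in range(cols)] for i in range(rows)]
```

Because the projection is applied to the weights themselves, the change survives any system prompt and cannot be toggled off at inference time.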


Support dealignai

All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.

Support us on Ko-fi — check out the Ko-fi membership for early access and extras.

Have questions or need help with a specific model? DM us — we help for free most of the time.

Ko-fi | X @dealignai | dealign.ai


About dealignai


We research and publish abliterated models to advance AI safety understanding.

Follow us: 𝕏 @dealignai

See our research: Safety Generalization in Frontier MoE Models

dealign.ai

Disclaimer

This model has had its safety refusal circuits removed. It will produce responses that would normally be refused, including technical content on security testing, dual-use research, and sensitive topics. You are responsible for how you use it.

The CRACK abliteration process does not add new capabilities — it only removes the model's learned refusal patterns. All knowledge, including the knowledge used to produce unsafe outputs, was already present in the base Qwen model.
