AutoRound quant fails to load with mlx-lm
#1
by smcleod - opened
FYI - your mixed-bit MLX AutoRound quants of Qwen3.5/3.6 models, including this repo, currently fail to load with the latest stock mlx-lm 0.31.3, hitting a shape mismatch on the first tensor whose bit width differs from the default.
I believe the bug is upstream, not in your quants: `qwen3_5.py`'s `Model.sanitize` remaps the weight key prefixes (`model.language_model.X` → `language_model.model.X`) but not the matching keys in `config["quantization"]`, so the per-tensor bit-width overrides no longer match any weight key and the global default bits get applied instead.
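To make the failure mode concrete, here is a minimal illustrative sketch (not the actual mlx-lm code; the function and key names are hypothetical, only the prefix rule is taken from the report above): if the weight keys are remapped but the quantization config keys are not, the per-tensor overrides silently stop matching, so remapping both dicts with the same rule is what a fix would need to do.

```python
# Hypothetical sketch of the sanitize step described above.
# Assumption: config["quantization"] holds scalar defaults (e.g. "bits",
# "group_size") plus per-tensor override entries keyed by weight path.

PREFIX_OLD = "model.language_model."
PREFIX_NEW = "language_model.model."

def remap_key(key: str) -> str:
    """Apply the prefix rule from the report: model.language_model.X
    -> language_model.model.X; leave all other keys unchanged."""
    if key.startswith(PREFIX_OLD):
        return PREFIX_NEW + key[len(PREFIX_OLD):]
    return key

def sanitize(weights: dict, quantization: dict) -> tuple[dict, dict]:
    """Remap weight keys AND the matching quantization-config keys.
    Remapping only the weights (the suspected upstream bug) leaves the
    per-tensor overrides pointing at keys that no longer exist."""
    new_weights = {remap_key(k): v for k, v in weights.items()}
    # Scalar settings like "bits" don't start with the prefix and pass
    # through unchanged; per-tensor overrides get the same remap.
    new_quant = {remap_key(k): v for k, v in quantization.items()}
    return new_weights, new_quant
```

With only the weights remapped, a lookup like `quantization.get(weight_key)` would return `None` for every overridden tensor, and a loader falling back to the default bit width would then see a shape mismatch exactly as described.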
Filed upstream at ml-explore/mlx-lm#1214; just a heads-up in case users start reporting load failures.
Thanks for the feedback. Please feel free to open an issue or PR if there’s anything AutoRound can help with.