Qwen3.5-9B-Claude-4.7 Reasoning Distill

Model Description

This model is a fine-tuned version of Qwen/Qwen3.5-9B, optimized for enhanced reasoning and step-by-step logical deduction. It was trained with Unsloth via distillation on high-quality reasoning chains produced by Claude 3 Opus.

The goal of this fine-tune is to bring "large-model" reasoning capabilities to a compact 9-billion-parameter model, enabling complex problem solving on consumer-grade hardware.

Training Details

The model was trained using a LoRA (Low-Rank Adaptation) approach, targeting all linear modules to maximize knowledge retention from the distillation source. A configuration sketch follows the list below.

  • Epochs: 2
  • Total Steps: 1,016
  • Effective Batch Size: 16
  • Learning Rate: 2e-4 (Linear Decay)
  • Final Loss: ~1.1074
  • Precision: 4-bit QLoRA (Bfloat16 internal)
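For reference, here is a minimal sketch of what this configuration looks like with Unsloth and TRL. Only the hyperparameters listed above come from the actual run; the LoRA rank, alpha, batch split, and the dataset variable are illustrative assumptions, and the exact TRL API may differ by version.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit (QLoRA) with bfloat16 compute.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Qwen/Qwen3.5-9B",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters to all linear modules, as described above.
# r and lora_alpha are assumptions; the card does not state them.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,  # hypothetical reasoning-chain dataset
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 4,   # 4 x 4 accumulation = 16 effective
        gradient_accumulation_steps = 4,
        num_train_epochs = 2,
        learning_rate = 2e-4,
        lr_scheduler_type = "linear",
        bf16 = True,
        output_dir = "outputs",
    ),
)
trainer.train()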

Training Tooling

This model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Usage

The model uses the standard ChatML prompt format. For best results, ask the model to "Think step by step" or provide a logic-heavy prompt.

Prompt Template (ChatML)

<|im_start|>system
You are a helpful assistant with advanced reasoning capabilities.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
<|im_thought|>
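If the tokenizer ships a chat template, you can build this prompt programmatically rather than by hand. A minimal sketch, assuming the template matches the ChatML format above and using the tokenizer loaded in the next section; the user question is illustrative:

messages = [
    {"role": "system", "content": "You are a helpful assistant with advanced reasoning capabilities."},
    {"role": "user", "content": "If all bloops are razzles and some razzles are lazzles, must some bloops be lazzles? Think step by step."},
]

# add_generation_prompt=True appends the opening assistant header so the
# model continues from <|im_start|>assistant. Whether the template also
# emits the <|im_thought|> tag shown above depends on the shipped template.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True,
)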

How to use

You can load this model using the transformers library or via Unsloth for optimized inference:

from unsloth import FastLanguageModel

# Load the model and tokenizer in 4-bit precision.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "keypa/Qwen3.5-9B-Claude-4.7",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Enable Unsloth's optimized inference path.
FastLanguageModel.for_inference(model)
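A short generation example to go with the snippet above; the prompt text is illustrative:

# Build a ChatML prompt (see the template above) and generate.
messages = [
    {"role": "system", "content": "You are a helpful assistant with advanced reasoning capabilities."},
    {"role": "user", "content": "Think step by step: what is 17 * 24?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens = 512)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))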