Qwen3.5-9B-Claude-4.7 Reasoning Distill

Model Description

This model is a fine-tuned version of Qwen/Qwen3.5-9B, optimized for enhanced reasoning and step-by-step logical deduction. It was trained with Unsloth via distillation on high-quality reasoning chains produced by Claude 3 Opus.

The goal of this fine-tune is to bring "large-model" reasoning capabilities to a compact 9-billion-parameter model, enabling complex problem solving on consumer-grade hardware.

Training Details

The model was trained using a LoRA (Low-Rank Adaptation) approach, targeting all linear modules to maximize knowledge retention from the distillation source. A configuration sketch follows the list below.

  • Epochs: 2
  • Total Steps: 1,016
  • Effective Batch Size: 16
  • Learning Rate: 2e-4 (Linear Decay)
  • Final Loss: ~1.1074
  • Precision: 4-bit QLoRA (Bfloat16 internal)
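For reference, here is a minimal sketch of what this configuration looks like with Unsloth and TRL. Only the hyperparameters listed above come from the actual run; the LoRA rank, alpha, batch split, and the dataset variable are illustrative assumptions, and the exact TRL API may differ by version.

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit (QLoRA) with bfloat16 compute.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Qwen/Qwen3.5-9B",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Attach LoRA adapters to all linear modules, as described above.
# r and lora_alpha are assumptions; the card does not state them.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,  # hypothetical reasoning-chain dataset
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 4,   # 4 x 4 accumulation = 16 effective
        gradient_accumulation_steps = 4,
        num_train_epochs = 2,
        learning_rate = 2e-4,
        lr_scheduler_type = "linear",
        bf16 = True,
        output_dir = "outputs",
    ),
)
trainer.train()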

Training Tooling

This model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Usage

The model uses the standard ChatML prompt format. For best results, ask the model to "Think step by step" or provide a logic-heavy prompt.

Prompt Template (ChatML)

<|im_start|>system
You are a helpful assistant with advanced reasoning capabilities.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
<|im_thought|>
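If the tokenizer ships a chat template, you can build this prompt programmatically rather than by hand. A minimal sketch, assuming the template matches the ChatML format above and using the tokenizer loaded in the next section; the user question is illustrative:

messages = [
    {"role": "system", "content": "You are a helpful assistant with advanced reasoning capabilities."},
    {"role": "user", "content": "If all bloops are razzles and some razzles are lazzles, must some bloops be lazzles? Think step by step."},
]

# add_generation_prompt=True appends the opening assistant header so the
# model continues from <|im_start|>assistant. Whether the template also
# emits the <|im_thought|> tag shown above depends on the shipped template.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize = False,
    add_generation_prompt = True,
)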

How to use

You can load this model using the transformers library or via Unsloth for optimized inference:

from unsloth import FastLanguageModel

# Load the model and tokenizer in 4-bit precision.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "keypa/Qwen3.5-9B-Claude-4.7",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# Enable Unsloth's optimized inference path.
FastLanguageModel.for_inference(model)
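A short generation example to go with the snippet above; the prompt text is illustrative:

# Build a ChatML prompt (see the template above) and generate.
messages = [
    {"role": "system", "content": "You are a helpful assistant with advanced reasoning capabilities."},
    {"role": "user", "content": "Think step by step: what is 17 * 24?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt = True,
    return_tensors = "pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens = 512)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))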