
This is a decensored version of TeichAI/Qwen3.5-27B-Claude-Opus-4.6-Distill, made using Heretic v1.2.0

Abliteration parameters

| Parameter | Value |
|---|---|
| direction_index | 51.21 |
| attn.o_proj.max_weight | 1.41 |
| attn.o_proj.max_weight_position | 41.72 |
| attn.o_proj.min_weight | 1.09 |
| attn.o_proj.min_weight_distance | 36.93 |
| mlp.down_proj.max_weight | 1.47 |
| mlp.down_proj.max_weight_position | 38.63 |
| mlp.down_proj.min_weight | 1.42 |
| mlp.down_proj.min_weight_distance | 36.17 |
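For context, abliteration works by projecting a learned "refusal direction" out of selected weight matrices. The sketch below shows only that core projection, under stated assumptions: Heretic's actual direction extraction, per-layer weight scheduling, and the parameters in the table above are not reproduced here, and all names are illustrative.

```python
import numpy as np

def ablate_direction(W: np.ndarray, d: np.ndarray, weight: float = 1.0) -> np.ndarray:
    """Remove the component of W's outputs that lies along direction d.

    W has shape (d_out, d_in); d is a direction in the output space.
    weight scales how much of the direction is removed (1.0 = full ablation),
    loosely analogous to the per-matrix max/min weights in the table above.
    """
    d = d / np.linalg.norm(d)  # ensure unit length
    # W' = W - weight * d d^T W  (rank-1 orthogonal projection)
    return W - weight * np.outer(d, d) @ W

# Toy check: after full ablation, outputs have no component along d.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
d = rng.normal(size=8)
W2 = ablate_direction(W, d)
x = rng.normal(size=4)
print(abs(float(d @ W2 @ x)) < 1e-9)  # True
```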

Performance

| Metric | This model | Original model (TeichAI/Qwen3.5-27B-Claude-Opus-4.6-Distill) |
|---|---|---|
| KL divergence | 0.0079 | 0 (by definition) |
| Refusals | 21/100 | 92/100 |
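The KL divergence above measures how far the abliterated model's next-token distribution drifts from the original's (lower means behavior is better preserved, and a model compared with itself scores 0 by definition). A minimal sketch of the metric itself, on made-up logits rather than Heretic's actual evaluation harness:

```python
import math

def kl_divergence(p_logits, q_logits):
    """KL(P || Q) between two next-token distributions given as raw logits."""
    def softmax(logits):
        m = max(logits)  # subtract max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        s = sum(exps)
        return [e / s for e in exps]
    p, q = softmax(p_logits), softmax(q_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical logits give exactly 0, as in the table's "by definition" entry.
print(kl_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0
# Perturbed logits give a small positive divergence.
print(kl_divergence([1.0, 2.0, 3.0], [1.1, 2.0, 2.9]) > 0)  # True
```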

Qwen3.5 27B x Claude Opus 4.6

Big thanks to @EclipseMist for providing the LoRAs for this model.

Happy to share that this is one of the best models on this account for all-around use (including agentic coding).

  • 🧬 Datasets:

    • crownelius/Opus-4.6-Reasoning-2100x-formatted
    • Personal Claude Data provided by @EclipseMist
  • 🏗 Base Model:

    • unsloth/Qwen3.5-27B
  • ⚡ Use cases:

    • Coding
    • Creative Writing
    • Visual Understanding
    • General Purpose

Citations and Contributions

  • @EclipseMist - Training and Data Curation
  • @crownelius - Data Curation
  • @unsloth - This Qwen3.5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
  • @Qwen - Providing a fantastic, native-multimodal base model

The following best practices are recommended by Qwen:

Best Practices

To achieve optimal performance, we recommend the following settings:

  1. Sampling Parameters:

    • We suggest using the following sets of sampling parameters depending on the mode and task type:
      • Thinking mode for general tasks:
        temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
      • Thinking mode for precise coding tasks (e.g., WebDev):
        temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
      • Instruct (or non-thinking) mode for general tasks:
        temperature=0.7, top_p=0.8, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
      • Instruct (or non-thinking) mode for reasoning tasks:
        temperature=1.0, top_p=1.0, top_k=40, min_p=0.0, presence_penalty=2.0, repetition_penalty=1.0
    • For supported frameworks, you can adjust the presence_penalty parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.
  2. Adequate Output Length: We recommend using an output length of 32,768 tokens for most queries. For benchmarking on highly complex problems, such as those found in math and programming competitions, we suggest setting the max output length to 81,920 tokens. This provides the model with sufficient space to generate detailed and comprehensive responses, thereby enhancing its overall performance.

  3. Standardize Output Format: We recommend using prompts to standardize model outputs when benchmarking.

    • Math Problems: Include "Please reason step by step, and put your final answer within \boxed{}." in the prompt.
    • Multiple-Choice Questions: Add the following instruction to the prompt to standardize responses: "Please show your choice in the answer field with only the choice letter, e.g., `"answer": "C"`."
  4. No Thinking Content in History: In multi-turn conversations, the historical model output should include only the final response, not the thinking content. This is already implemented in the provided Jinja2 chat template; frameworks that do not use that template directly must ensure this practice is followed themselves.

  5. Long Video Understanding: To optimize inference efficiency for plain text and images, the size parameter in the released video_preprocessor_config.json is conservatively configured. It is recommended to set the longest_edge parameter in the video_preprocessor_config file to 469,762,048 (corresponding to 224k video tokens) to enable higher frame-rate sampling for hour-scale videos and thereby achieve superior performance. For example,

    {"longest_edge": 469762048, "shortest_edge": 4096}
    

    Alternatively, override the default values via engine startup parameters. For implementation details, refer to: vLLM / SGLang.
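The sampling presets from point 1 can be sent to any OpenAI-compatible endpoint such as a local vLLM or SGLang server. A hedged sketch that only builds the request payload: the model name is taken from this repo, and `top_k`/`min_p`/`repetition_penalty` are assumed to be accepted as server-side extension fields (as vLLM's OpenAI-compatible server does), not standard OpenAI parameters.

```python
def build_request(prompt: str, mode: str = "thinking", task: str = "general") -> dict:
    """Build a /chat/completions payload using the recommended sampling presets."""
    # Presets from the table above, keyed by (mode, task).
    presets = {
        ("thinking", "general"):   dict(temperature=1.0, top_p=0.95, presence_penalty=1.5),
        ("thinking", "coding"):    dict(temperature=0.6, top_p=0.95, presence_penalty=0.0),
        ("instruct", "general"):   dict(temperature=0.7, top_p=0.8,  presence_penalty=1.5),
        ("instruct", "reasoning"): dict(temperature=1.0, top_p=1.0,  presence_penalty=2.0),
    }
    return {
        "model": "darkc0de/XORTRON.CriminalComputing.Q35xC46",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 32768,  # recommended output length for most queries
        # Extension fields (vLLM/SGLang); the reasoning preset uses top_k=40:
        "top_k": 40 if task == "reasoning" else 20,
        "min_p": 0.0,
        "repetition_penalty": 1.0,
        **presets[(mode, task)],
    }

print(build_request("Hello", "instruct", "general")["temperature"])  # 0.7
```

POST the resulting dict as JSON to your server's `/v1/chat/completions` route with any HTTP client.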

