
This is a decensored version of TeichAI/Qwen3.5-27B-Claude-Opus-4.6-Distill, made using Heretic v1.2.0

Abliteration parameters

| Parameter | Value |
|---|---|
| direction_index | 51.21 |
| attn.o_proj.max_weight | 1.41 |
| attn.o_proj.max_weight_position | 41.72 |
| attn.o_proj.min_weight | 1.09 |
| attn.o_proj.min_weight_distance | 36.93 |
| mlp.down_proj.max_weight | 1.47 |
| mlp.down_proj.max_weight_position | 38.63 |
| mlp.down_proj.min_weight | 1.42 |
| mlp.down_proj.min_weight_distance | 36.17 |
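For context, abliteration works by projecting a learned "refusal direction" out of selected weight matrices. The sketch below shows only that core projection, under stated assumptions: Heretic's actual direction extraction, per-layer weight scheduling, and the parameters in the table above are not reproduced here, and all names are illustrative.

```python
import numpy as np

def ablate_direction(W: np.ndarray, d: np.ndarray, weight: float = 1.0) -> np.ndarray:
    """Remove the component of W's outputs that lies along direction d.

    W has shape (d_out, d_in); d is a direction in the output space.
    weight scales how much of the direction is removed (1.0 = full ablation),
    loosely analogous to the per-matrix max/min weights in the table above.
    """
    d = d / np.linalg.norm(d)  # ensure unit length
    # W' = W - weight * d d^T W  (rank-1 orthogonal projection)
    return W - weight * np.outer(d, d) @ W

# Toy check: after full ablation, outputs have no component along d.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
d = rng.normal(size=8)
W2 = ablate_direction(W, d)
x = rng.normal(size=4)
print(abs(float(d @ W2 @ x)) < 1e-9)  # True
```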

Performance

| Metric | This model | Original model (TeichAI/Qwen3.5-27B-Claude-Opus-4.6-Distill) |
|---|---|---|
| KL divergence | 0.0079 | 0 (by definition) |
| Refusals | 21/100 | 92/100 |
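The KL divergence above measures how far the abliterated model's next-token distribution drifts from the original's (lower means behavior is better preserved, and a model compared with itself scores 0 by definition). A minimal sketch of the metric itself, on made-up logits rather than Heretic's actual evaluation harness:

```python
import math

def kl_divergence(p_logits, q_logits):
    """KL(P || Q) between two next-token distributions given as raw logits."""
    def softmax(logits):
        m = max(logits)  # subtract max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        s = sum(exps)
        return [e / s for e in exps]
    p, q = softmax(p_logits), softmax(q_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical logits give exactly 0, as in the table's "by definition" entry.
print(kl_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0
# Perturbed logits give a small positive divergence.
print(kl_divergence([1.0, 2.0, 3.0], [1.1, 2.0, 2.9]) > 0)  # True
```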

Qwen3.5 27B x Claude Opus 4.6

Big thanks to @EclipseMist for providing the LoRAs for this model.

Happy to share that this is one of the best models on this account for all-around use (including agentic coding).

  • 🧬 Datasets:

    • crownelius/Opus-4.6-Reasoning-2100x-formatted
    • Personal Claude Data provided by @EclipseMist
  • 🏗 Base Model:

    • unsloth/Qwen3.5-27B
  • ⚡ Use cases:

    • Coding
    • Creative Writing
    • Visual Understanding
    • General Purpose

Citations and Contributions

  • @EclipseMist - Training and Data Curation
  • @crownelius - Data Curation
  • @unsloth - This Qwen3.5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
  • @Qwen - Providing a fantastic, native-multimodal base model

The following best practices are recommended by Qwen:

Best Practices

To achieve optimal performance, we recommend the following settings:

  1. Sampling Parameters:

    • We suggest using the following sets of sampling parameters depending on the mode and task type:
      • Thinking mode for general tasks:
        temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
      • Thinking mode for precise coding tasks (e.g., WebDev):
        temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
      • Instruct (or non-thinking) mode for general tasks:
        temperature=0.7, top_p=0.8, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
      • Instruct (or non-thinking) mode for reasoning tasks:
        temperature=1.0, top_p=1.0, top_k=40, min_p=0.0, presence_penalty=2.0, repetition_penalty=1.0
    • For supported frameworks, you can adjust the presence_penalty parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.
  2. Adequate Output Length: We recommend using an output length of 32,768 tokens for most queries. For benchmarking on highly complex problems, such as those found in math and programming competitions, we suggest setting the max output length to 81,920 tokens. This provides the model with sufficient space to generate detailed and comprehensive responses, thereby enhancing its overall performance.

  3. Standardize Output Format: We recommend using prompts to standardize model outputs when benchmarking.

    • Math Problems: Include "Please reason step by step, and put your final answer within \boxed{}." in the prompt.
    • Multiple-Choice Questions: Add the following instruction to the prompt to standardize responses: "Please show your choice in the answer field with only the choice letter, e.g., `"answer": "C"`."
  4. No Thinking Content in History: In multi-turn conversations, the historical model output should include only the final response, not the thinking content. This is already implemented in the provided Jinja2 chat template; frameworks that do not use that template directly must ensure this practice is followed themselves.

  5. Long Video Understanding: To optimize inference efficiency for plain text and images, the size parameter in the released video_preprocessor_config.json is conservatively configured. It is recommended to set the longest_edge parameter in the video_preprocessor_config file to 469,762,048 (corresponding to 224k video tokens) to enable higher frame-rate sampling for hour-scale videos and thereby achieve superior performance. For example,

    {"longest_edge": 469762048, "shortest_edge": 4096}
    

    Alternatively, override the default values via engine startup parameters. For implementation details, refer to: vLLM / SGLang.
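The sampling presets from point 1 can be sent to any OpenAI-compatible endpoint such as a local vLLM or SGLang server. A hedged sketch that only builds the request payload: the model name is taken from this repo, and `top_k`/`min_p`/`repetition_penalty` are assumed to be accepted as server-side extension fields (as vLLM's OpenAI-compatible server does), not standard OpenAI parameters.

```python
def build_request(prompt: str, mode: str = "thinking", task: str = "general") -> dict:
    """Build a /chat/completions payload using the recommended sampling presets."""
    # Presets from the table above, keyed by (mode, task).
    presets = {
        ("thinking", "general"):   dict(temperature=1.0, top_p=0.95, presence_penalty=1.5),
        ("thinking", "coding"):    dict(temperature=0.6, top_p=0.95, presence_penalty=0.0),
        ("instruct", "general"):   dict(temperature=0.7, top_p=0.8,  presence_penalty=1.5),
        ("instruct", "reasoning"): dict(temperature=1.0, top_p=1.0,  presence_penalty=2.0),
    }
    return {
        "model": "darkc0de/XORTRON.CriminalComputing.Q35xC46",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 32768,  # recommended output length for most queries
        # Extension fields (vLLM/SGLang); the reasoning preset uses top_k=40:
        "top_k": 40 if task == "reasoning" else 20,
        "min_p": 0.0,
        "repetition_penalty": 1.0,
        **presets[(mode, task)],
    }

print(build_request("Hello", "instruct", "general")["temperature"])  # 0.7
```

POST the resulting dict as JSON to your server's `/v1/chat/completions` route with any HTTP client.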

