
T-Bench Qwen SFT Multi-Task NAT v11

Model Description

This is a Qwen3-8B model fine-tuned on terminal bench tasks using Negative-Aware Training (NAT) v11. It is the latest iteration in the NAT series, with enhanced strategies for generating and using negative examples.

Training Details

  • Base Model: Qwen/Qwen3-8B
  • Training Method: Negative-Aware Training (NAT) v11
  • Epochs: 300
  • Learning Rate: 5e-5
  • Max Length: 32768 tokens
  • Batch Size: 4 (2 per GPU with data parallelism)
  • Attention: FlashAttention 2
  • Precision: bfloat16
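
The hyperparameters above can be summarized as a config sketch. The key names here are illustrative, not the actual training script's:

```python
# Illustrative summary of the card's hyperparameters; the key names are
# assumptions, not the real pipeline configuration.
train_config = {
    "base_model": "Qwen/Qwen3-8B",
    "method": "NAT v11",
    "num_epochs": 300,
    "learning_rate": 5e-5,
    "max_length": 32768,
    "per_device_batch_size": 2,
    "num_gpus": 2,                               # data parallelism
    "attn_implementation": "flash_attention_2",
    "dtype": "bfloat16",
}

# Global batch size = per-device batch size x number of GPUs
global_batch = train_config["per_device_batch_size"] * train_config["num_gpus"]  # 4
```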

Dataset Composition

The training dataset includes a balanced mix of positive and negative examples:

  • Positive Examples: Successful terminal command executions
  • Negative Examples: Various failure patterns and common mistakes
  • Context Length: Extended to 32768 tokens for longer terminal sessions
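
As a hedged illustration (the card does not specify the actual record schema), a negative-aware dataset entry might pair each terminal interaction with a label so the training loss can treat failures differently from successes:

```python
# Sketch of a negative-aware training record. The field names ("messages",
# "label", "weight") and the weighting scheme are assumptions, not the
# pipeline's actual format.
def make_record(prompt, completion, is_positive):
    """Pack one terminal interaction into a labeled training record.

    Negative examples keep the same structure but are flagged so the
    loss can down-weight or penalize the failing completion.
    """
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": completion},
        ],
        "label": "positive" if is_positive else "negative",
        "weight": 1.0 if is_positive else -0.5,  # example weighting only
    }

good = make_record("Show disk usage", "df -h", True)
bad = make_record("Show disk usage", "du /", False)
```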

NAT v11 Improvements

This version builds upon previous NAT iterations with:

  • Enhanced negative example generation
  • Better coverage of edge cases
  • Improved system prompts
  • More diverse failure patterns

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Aznaur/tbench-qwen-sft-multitask-nat-v11",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "Aznaur/tbench-qwen-sft-multitask-nat-v11",
    trust_remote_code=True,
)

Model Features

  • Context Length: 32768 tokens
  • Memory Efficient: Uses FlashAttention 2 and gradient checkpointing
  • Negative-Aware: Trained to avoid common failure patterns
  • Long Context: Supports extended terminal sessions
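
Even with the 32768-token window, long terminal sessions eventually need trimming. A minimal sketch of one approach, dropping the oldest turns first, using a rough character-based budget rather than the model's actual tokenizer:

```python
def fit_to_window(turns, max_tokens=32768, chars_per_token=4):
    """Keep the most recent turns within a rough token budget.

    Uses a crude chars/4 token estimate; in practice, count tokens
    with the model's own tokenizer instead.
    """
    budget = max_tokens * chars_per_token
    kept, used = [], 0
    for turn in reversed(turns):          # walk newest-first
        if used + len(turn) > budget:
            break                          # oldest turns fall off
        kept.append(turn)
        used += len(turn)
    return list(reversed(kept))           # restore chronological order

session = ["x" * 500, "$ ls", "file.txt"]
trimmed = fit_to_window(session, max_tokens=100)  # 400-char budget
# The oversized oldest turn is dropped; recent turns survive.
```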

Hardware Requirements

  • GPU Memory: the bfloat16 weights alone occupy ~16GB, so plan for more than 16GB to leave headroom for activations and the KV cache
  • Recommended: A100 40GB+ for optimal performance
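
The ~16GB weight figure follows from simple arithmetic (activations and KV cache add more on top):

```python
# Back-of-the-envelope weight memory for an 8B-parameter model in bfloat16.
params = 8e9            # Qwen3-8B parameter count (approximate)
bytes_per_param = 2     # bfloat16 = 16 bits = 2 bytes
weights_gb = params * bytes_per_param / 1e9
print(weights_gb)       # 16.0
```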

Training Pipeline

  • Dataset Creation: Multi-task NAT v11 pipeline
  • Training Config: Optimized for 2xA100 with data parallelism
  • Negative Examples: Enhanced coverage of failure patterns

License

This model inherits the license from the base Qwen3-8B model.