YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)
T-Bench Qwen SFT Multi-Task NAT v11
Model Description
This is a Qwen3-8B model fine-tuned on terminal bench tasks using Negative-Aware Training (NAT) v11. This model represents the latest iteration in the NAT series with enhanced negative example strategies.
Training Details
- Base Model: Qwen/Qwen3-8B
- Training Method: Negative-Aware Training (NAT) v11
- Epochs: 300
- Learning Rate: 5e-5
- Max Length: 32768 tokens
- Batch Size: 4 (2 per GPU with data parallelism)
- Attention: FlashAttention 2
- Precision: bfloat16
Dataset Composition
The training dataset includes a balanced mix of positive and negative examples:
- Positive Examples: Successful terminal command executions
- Negative Examples: Various failure patterns and common mistakes
- Context Length: Extended to 32768 tokens for longer terminal sessions
NAT v11 Improvements
This version builds upon previous NAT iterations with:
- Enhanced negative example generation
- Better coverage of edge cases
- Improved system prompts
- More diverse failure patterns
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
"Aznaur/tbench-qwen-sft-multitask-nat-v11",
trust_remote_code=True,
torch_dtype=torch.bfloat16,
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
"Aznaur/tbench-qwen-sft-multitask-nat-v11",
trust_remote_code=True
)
Model Features
- Context Length: 32768 tokens
- Memory Efficient: Uses FlashAttention 2 and gradient checkpointing
- Negative-Aware: Trained to avoid common failure patterns
- Long Context: Supports extended terminal sessions
Hardware Requirements
- GPU Memory: ~16GB minimum (model is ~16GB with bfloat16)
- Recommended: A100 40GB+ for optimal performance
Training Pipeline
- Dataset Creation: Multi-task NAT v11 pipeline
- Training Config: Optimized for 2xA100 with data parallelism
- Negative Examples: Enhanced coverage of failure patterns
License
This model inherits the license from the base Qwen3-8B model.
- Downloads last month
- 2
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support