# T-Bench Qwen SFT Multi-Task NAT v11

## Model Description
This is a Qwen3-8B model fine-tuned on terminal-bench tasks using Negative-Aware Training (NAT) v11, the latest iteration in the NAT series with enhanced negative-example strategies.

## Training Details
- **Base Model**: Qwen/Qwen3-8B
- **Training Method**: Negative-Aware Training (NAT) v11
- **Epochs**: 300
- **Learning Rate**: 5e-5
- **Max Length**: 32768 tokens
- **Batch Size**: 4 (2 per GPU with data parallelism)
- **Attention**: FlashAttention 2
- **Precision**: bfloat16

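For reference, the hyperparameters above can be collected in one place. The dict below is an illustrative sketch only: the field names follow common Hugging Face `TrainingArguments` conventions and are assumptions, since the actual NAT v11 training script is not published with this card.

```python
# Hypothetical consolidation of the hyperparameters listed above.
# Field names mirror Hugging Face TrainingArguments conventions; the real
# training script for this model is not part of this model card.
nat_v11_config = {
    "model_name_or_path": "Qwen/Qwen3-8B",
    "num_train_epochs": 300,
    "learning_rate": 5e-5,
    "max_length": 32768,
    "per_device_train_batch_size": 2,  # x2 GPUs -> global batch size of 4
    "bf16": True,                      # bfloat16 precision
    "attn_implementation": "flash_attention_2",
    "gradient_checkpointing": True,
}

# Global batch size = per-device batch x number of data-parallel GPUs.
global_batch = nat_v11_config["per_device_train_batch_size"] * 2
print(global_batch)  # -> 4
```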
## Dataset Composition
The training dataset includes a balanced mix of positive and negative examples:
- **Positive Examples**: Successful terminal command executions
- **Negative Examples**: Various failure patterns and common mistakes
- **Context Length**: Extended to 32768 tokens for longer terminal sessions

## NAT v11 Improvements
This version builds on previous NAT iterations with:
- Enhanced negative-example generation
- Better coverage of edge cases
- Improved system prompts
- More diverse failure patterns

## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Aznaur/tbench-qwen-sft-multitask-nat-v11",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "Aznaur/tbench-qwen-sft-multitask-nat-v11",
    trust_remote_code=True,
)
```

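With the model and tokenizer loaded as above, generation follows the standard `transformers` chat workflow. The prompt below is purely illustrative; the tokenizer applies Qwen3's chat template. (Not run here, since it requires downloading the full model weights.)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Aznaur/tbench-qwen-sft-multitask-nat-v11"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

# Illustrative terminal-task prompt; any shell-oriented instruction works.
messages = [
    {"role": "user", "content": "Find all .log files larger than 10MB under /var/log."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```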
## Model Features
- **Context Length**: 32768 tokens, supporting extended terminal sessions
- **Memory Efficient**: Uses FlashAttention 2 and gradient checkpointing
- **Negative-Aware**: Trained to avoid common failure patterns

## Hardware Requirements
- GPU Memory: ~16GB minimum for the weights alone (8B parameters in bfloat16); inference additionally needs room for activations and the KV cache
- Recommended: A100 40GB+ for comfortable headroom

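The ~16GB figure follows from simple arithmetic: 8 billion parameters at 2 bytes each in bfloat16.

```python
# Back-of-envelope weight memory for an 8B-parameter model in bfloat16.
params = 8e9           # 8 billion parameters
bytes_per_param = 2    # bfloat16 = 16 bits = 2 bytes
gb = params * bytes_per_param / 1e9
print(f"{gb:.0f} GB")  # -> 16 GB (weights only; KV cache and activations add more)
```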
## Training Pipeline
- **Dataset Creation**: Multi-task NAT v11 pipeline
- **Training Config**: Optimized for 2x A100 with data parallelism
- **Negative Examples**: Enhanced coverage of failure patterns

## License
This model inherits the license from the base Qwen3-8B model.