# Terminal Agent - Multi-Task NAT v13

## Model Description
This model is fine-tuned from Qwen3-8B on multi-task terminal agent trajectories using Negative-Aware Training (NAT).

### Key Features
- **5 Tasks**: fix-git, cancel-async-tasks, log-summary-date-ranges, regex-log, pypi-server
- **Fixed Tool Signatures**: Fixes a critical bug in which `note_name` was incorrectly removed from the tool signatures
- **Clean Tool Calls**: Hallucinated parameters (`message_title`, `message_description`, `message_attachment`) were removed from the training data
- **Negative Examples**: Includes negative examples of two failure types, `looping` and `wrong_command`

### Training Details
- **Base Model**: Qwen/Qwen3-8B
- **Training Data**: 40 samples (20 positive, 20 negative)
- **Epochs**: 99/300 (checkpoint at epoch 99)
- **Learning Rate**: 5e-5
- **Batch Size**: 4
- **Global Step**: 899

### Evaluation Results
- **Overall Success Rate**: 56% (14/25 trials)
- **fix-git**: 4/5 (80%)
- **regex-log**: 4/5 (80%)
- **cancel-async-tasks**: 3/5 (60%)
- **log-summary-date-ranges**: 3/5 (60%)
- **pypi-server**: 0/5 (0%)
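
The overall figure follows directly from the per-task counts. The short sketch below only re-derives the arithmetic from the numbers listed above; the `results` dictionary and `success_rate` helper are illustrative names, not part of any evaluation harness shipped with this model.

```python
# Per-task results copied from the list above: (successes, trials).
results = {
    "fix-git": (4, 5),
    "regex-log": (4, 5),
    "cancel-async-tasks": (3, 5),
    "log-summary-date-ranges": (3, 5),
    "pypi-server": (0, 5),
}


def success_rate(successes, trials):
    """Return the success rate as a percentage."""
    return 100.0 * successes / trials


# Pool all trials to get the overall success rate.
total_successes = sum(s for s, _ in results.values())
total_trials = sum(t for _, t in results.values())
overall = success_rate(total_successes, total_trials)

for task, (s, t) in results.items():
    print(f"{task}: {s}/{t} ({success_rate(s, t):.0f}%)")
print(f"overall: {total_successes}/{total_trials} ({overall:.0f}%)")
```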

### Tool Signatures (Corrected)
- `shell_exec(id, command, block)`
- `shell_write_content_to_file(content, file_path)`
- `create_note(note_name, content)`
- `append_note(note_name, content)`
- `read_note(note_name)`

### Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "alievak/terminal_agent_multitask_nat_v13",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "alievak/terminal_agent_multitask_nat_v13"
)

# Example usage
messages = [
    {"role": "system", "content": "You are a terminal agent..."},
    {"role": "user", "content": "Fix the git repository..."}
]
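# return_dict=True yields input_ids and attention_mask, so the result can be unpacked into generate()
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
# Decode only the newly generated tokens, not the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)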
```

### V13 Fixes
1. **Keep `note_name`**: Required by the runtime (it was incorrectly removed in v12)
2. **System prompt uses `note_name`**: Matches runtime expectations
3. **Remove only hallucinated parameters**: `message_title`, `message_description`, `message_attachment`
4. **Added tool call validation**: Catches signature issues before training
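
The validation step in item 4 is not published with this model card. As a rough illustration of the idea, the sketch below checks a tool call against the signatures listed under "Tool Signatures (Corrected)". The function name `validate_tool_call`, the dictionary layout, and the assumption that every listed parameter is required are all hypothetical choices made for this example.

```python
# Signatures copied from the "Tool Signatures (Corrected)" list.
# Assumption: every listed parameter is required.
TOOL_SIGNATURES = {
    "shell_exec": {"id", "command", "block"},
    "shell_write_content_to_file": {"content", "file_path"},
    "create_note": {"note_name", "content"},
    "append_note": {"note_name", "content"},
    "read_note": {"note_name"},
}


def validate_tool_call(name, arguments):
    """Return a list of problems; an empty list means the call matches its signature."""
    if name not in TOOL_SIGNATURES:
        return [f"unknown tool: {name}"]
    expected = TOOL_SIGNATURES[name]
    given = set(arguments)
    problems = []
    # Parameters outside the signature, such as message_title, are flagged.
    for param in sorted(given - expected):
        problems.append(f"unexpected parameter: {param}")
    # Parameters the signature lists but the call omits are flagged too.
    for param in sorted(expected - given):
        problems.append(f"missing parameter: {param}")
    return problems
```

Under these assumptions, `validate_tool_call("create_note", {"content": "x", "message_title": "t"})` reports both the hallucinated `message_title` and the missing `note_name`, which is the class of error that v12 introduced.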

### Known Issues
- The model still emits hallucinated parameters (412 occurrences in the evaluation runs); this behavior is attributed to the base model
- pypi-server fails because the model writes files without first creating the target directory with `mkdir`
- Action sequences diverge from the teacher trajectories
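
One possible workaround for the pypi-server failure is to have the calling harness create the parent directory before each file write. The sketch below is only an assumption about how such a guard could look: `mkdir_command_for` is a hypothetical helper, and the example path is illustrative rather than taken from the task.

```python
import posixpath
import shlex


def mkdir_command_for(file_path):
    """Build a shell command that creates the parent directory of file_path.

    Returns None when the path has no directory component.
    """
    parent = posixpath.dirname(file_path)
    if not parent:
        return None  # The file lives in the working directory; nothing to create.
    # shlex.quote protects directory names that contain spaces or shell metacharacters.
    return f"mkdir -p {shlex.quote(parent)}"


# Illustrative path only.
print(mkdir_command_for("/srv/pypi/packages/index.html"))
```

A harness could pass the resulting command to `shell_exec` before forwarding the model's `shell_write_content_to_file` call, so the write no longer depends on the model remembering the `mkdir` step.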

### Model Size
- **Total**: ~16 GB (4 safetensors shards)
- **Parameters**: 8.2B

## License
MIT License

## Citation
If you use this model, please cite:
```bibtex
@misc{terminal_agent_v13,
  title={Terminal Agent Multi-Task NAT v13},
  author={alievak},
  year={2026},
  url={https://huggingface.co/alievak/terminal_agent_multitask_nat_v13}
}
```