
Terminal Agent - Multi-Task NAT v13

Model Description

This model is fine-tuned from Qwen3-8B on multi-task terminal agent trajectories using Negative-Aware Training (NAT).

Key Features

  • 5 Tasks: fix-git, cancel-async-tasks, log-summary-date-ranges, regex-log, pypi-server
  • Fixed Tool Signatures: Fixes a critical v12 bug in which the note_name parameter was incorrectly removed from tool calls
  • Clean Tool Calls: Hallucinated parameters (message_title, message_description, message_attachment) were removed from the training data
  • Negative Examples: Includes looping and wrong_command negative examples

Training Details

  • Base Model: Qwen/Qwen3-8B
  • Training Data: 40 samples (20 positive, 20 negative)
  • Epochs: 99/300 (checkpoint at epoch 99)
  • Learning Rate: 5e-5
  • Batch Size: 4
  • Global Step: 899

Evaluation Results

  • Overall Success Rate: 56% (14/25 trials)
  • fix-git: 4/5 (80%)
  • regex-log: 4/5 (80%)
  • cancel-async-tasks: 3/5 (60%)
  • log-summary-date-ranges: 3/5 (60%)
  • pypi-server: 0/5 (0%)
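
The overall figure follows directly from the per-task counts. As a small sketch, the aggregation can be reproduced like this (the `results` dictionary and the `success_rate` helper are illustrative names, not part of the evaluation harness):

```python
# Per-task (successes, trials), copied from the results listed above.
results = {
    "fix-git": (4, 5),
    "regex-log": (4, 5),
    "cancel-async-tasks": (3, 5),
    "log-summary-date-ranges": (3, 5),
    "pypi-server": (0, 5),
}


def success_rate(successes: int, trials: int) -> float:
    """Return the success rate as a percentage."""
    return 100.0 * successes / trials


# Pool successes and trials across tasks before dividing.
total_successes = sum(s for s, _ in results.values())
total_trials = sum(t for _, t in results.values())
overall = success_rate(total_successes, total_trials)

for task, (s, t) in results.items():
    print(f"{task}: {s}/{t} ({success_rate(s, t):.0f}%)")
print(f"Overall: {total_successes}/{total_trials} ({overall:.0f}%)")
```

Because every task has the same number of trials, the pooled rate equals the mean of the per-task rates.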

Tool Signatures (Corrected)

  • shell_exec(id, command, block)
  • shell_write_content_to_file(content, file_path)
  • create_note(note_name, content)
  • append_note(note_name, content)
  • read_note(note_name)
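
A check against these signatures might look like the sketch below. It is not the validation code used for this model; it assumes a tool call is represented as a tool name plus a dictionary of arguments, and that every listed parameter is required. `TOOL_SIGNATURES` and `validate_tool_call` are hypothetical names.

```python
# Allowed parameters per tool, copied from the signature list above.
# Assumption: every listed parameter is required.
TOOL_SIGNATURES = {
    "shell_exec": {"id", "command", "block"},
    "shell_write_content_to_file": {"content", "file_path"},
    "create_note": {"note_name", "content"},
    "append_note": {"note_name", "content"},
    "read_note": {"note_name"},
}


def validate_tool_call(name: str, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the call matches."""
    if name not in TOOL_SIGNATURES:
        return [f"unknown tool: {name}"]
    expected = TOOL_SIGNATURES[name]
    given = set(arguments)
    problems = []
    # Parameters the tool does not accept (e.g. hallucinated ones).
    for extra in sorted(given - expected):
        problems.append(f"unexpected parameter: {extra}")
    # Parameters the tool needs but the call omitted.
    for missing in sorted(expected - given):
        problems.append(f"missing parameter: {missing}")
    return problems


# A call carrying a hallucinated parameter and lacking note_name.
print(validate_tool_call("create_note", {"content": "x", "message_title": "t"}))
```

Returning a list of problems rather than raising on the first one makes it easier to count how often each kind of signature issue occurs across a dataset.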

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "alievak/terminal_agent_multitask_nat_v13",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "alievak/terminal_agent_multitask_nat_v13"
)

# Example usage
messages = [
    {"role": "system", "content": "You are a terminal agent..."},
    {"role": "user", "content": "Fix the git repository..."}
]
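# Build the prompt with the chat template; return_dict=True yields a mapping
# (input_ids, attention_mask) that can be unpacked into generate().
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)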

V13 Fixes

  1. Keep note_name - Required by the runtime (it was incorrectly removed in v12)
  2. System prompt uses note_name - Matches runtime expectations
  3. Remove only hallucinated params - message_title, message_description, message_attachment
  4. Added tool call validation - Catches signature issues before training
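
Fixes 1 and 3 amount to a narrow filter: drop the three hallucinated parameters and leave everything else, including note_name, untouched. The sketch below illustrates the idea under the assumption that tool-call arguments are plain dictionaries. It is not the preprocessing code used for training, and `HALLUCINATED_PARAMS` and `clean_arguments` are hypothetical names.

```python
# The three parameters named in fix 3; nothing else is removed.
HALLUCINATED_PARAMS = {"message_title", "message_description", "message_attachment"}


def clean_arguments(arguments: dict) -> dict:
    """Drop only the known hallucinated parameters; keep all others."""
    return {k: v for k, v in arguments.items() if k not in HALLUCINATED_PARAMS}


raw = {
    "note_name": "plan",
    "content": "step 1",
    "message_title": "Plan",
    "message_description": "Initial plan",
}
cleaned = clean_arguments(raw)
print(cleaned)
```

Using an explicit deny-list, rather than an allow-list of "known good" names, is what keeps a required parameter such as note_name from being stripped by mistake.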

Known Issues

  • The model still outputs hallucinated parameters (412 occurrences in eval) - the behavior is embedded in the base model
  • pypi-server fails because the model writes files without first running mkdir to create the target directory
  • Action sequences diverge from the teacher trajectories
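
One possible mitigation for the pypi-server failure, offered only as an assumption and not something this model card's runtime is known to do, is to have the file-writing tool create missing parent directories itself. In the sketch below, `write_content_to_file` is a hypothetical stand-in for the runtime side of shell_write_content_to_file, and the file path is made up for the demo.

```python
import tempfile
from pathlib import Path


def write_content_to_file(content: str, file_path: str) -> None:
    """Write content to file_path, creating parent directories if needed."""
    path = Path(file_path)
    # Equivalent to `mkdir -p` on the parent directory before writing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)


# Demo: the nested directories do not exist before the call.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "packages" / "demo" / "setup.py"
    write_content_to_file("print('hello')\n", str(target))
    written = target.read_text()

print(written, end="")
```

A guard like this hides the missing step rather than teaching the model to issue mkdir, so it is a runtime workaround, not a fix for the underlying behavior.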

Model Size

  • Total: ~16GB (4 safetensors shards, BF16)
  • Parameters: 8.2B

License

MIT License

Citation

If you use this model, please cite:

@misc{terminal_agent_v13,
  title={Terminal Agent Multi-Task NAT v13},
  author={alievak},
  year={2026},
  url={https://huggingface.co/alievak/terminal_agent_multitask_nat_v13}
}