🤖 Open Computer Agent v2.0

An enhanced universal computer-use agent built on smolagents, E2B Desktop, and Playwright.

What's New in v2.0

| Feature | Description |
|---|---|
| 🧠 Hierarchical Planner | Breaks goals into subtasks before execution using a cheap text model |
| 🔌 Playwright MCP | Semantic browser control (click by text/role, extract tables/links, evaluate JS) |
| 🎯 Multi-Model Router | Auto-selects the cheapest capable model (fast vision ↔ powerful vision ↔ fast text ↔ powerful text) |
| 🧩 Set-of-Marks Vision | Overlays numbered bounding boxes on UI elements for coordinate-free interaction |
| 🗄️ Long-Term Memory | ChromaDB vector store retrieves similar past tasks and proven strategies |
| 🔍 Verifier Agent | Checks subtask completion and triggers recovery loops |
| 🛑 Human-in-the-Loop | Pauses on sensitive actions (payments, emails, deletes) for user approval |
| 🎙️ Voice I/O | Speak tasks and hear responses via Whisper STT + Kokoro TTS |
| 💰 Cost Dashboard | Real-time $/task, token usage, and latency tracking |
| 📹 Session Recording | Saves every step as replayable macros with GIF/MP4 export potential |
| 🧪 Enhanced Eval | Built-in benchmark suite with LLM-as-a-Judge grading and A/B testing |

Architecture

User Input (Text / Voice / File)
       |
       v
[Intelligence Router] ----> Planner (JSON DAG)
       |
       v
[Memory Retrieval] (ChromaDB)
       |
       v
[Plan Executor]
       |
       +---> [Browser Sub-Agent] (Playwright MCP)
       +---> [Desktop Sub-Agent] (E2B + SoM Vision)
       +---> [Coder Sub-Agent] (Code Interpreter)
       +---> [HF Hub Sub-Agent] (Search / Upload)
       |
       v
[Verifier] -> Retry / Alternative / Continue
       |
       v
[Macro Saver] + Cost Report + Session Recording
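
The handoff from the planner to the plan executor in the diagram above can be sketched roughly as follows. This is an illustration only, not the project's actual code: the plan schema (`id`, `agent`, `depends_on`) and the helper `execution_order` are assumptions made for the example. It uses the standard library's `graphlib` to order the subtasks of a JSON DAG so that each one runs after its dependencies.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical plan format: the real planner's JSON schema may differ.
PLAN_JSON = """
[
  {"id": "open_site", "agent": "browser", "depends_on": []},
  {"id": "extract_table", "agent": "browser", "depends_on": ["open_site"]},
  {"id": "analyze", "agent": "coder", "depends_on": ["extract_table"]},
  {"id": "upload", "agent": "hf_hub", "depends_on": ["analyze"]}
]
"""

def execution_order(plan):
    """Return subtask ids in an order that respects depends_on edges."""
    graph = {task["id"]: set(task["depends_on"]) for task in plan}
    # static_order() raises graphlib.CycleError if the plan is not a DAG.
    return list(TopologicalSorter(graph).static_order())

plan = json.loads(PLAN_JSON)
agents = {task["id"]: task["agent"] for task in plan}
order = execution_order(plan)
for task_id in order:
    print(f"{task_id} -> {agents[task_id]} sub-agent")
```

A cyclic plan fails fast with `CycleError` instead of stalling the executor, which is one reason to validate the DAG before dispatching any subtask.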

Quick Start

  1. Set your HF_TOKEN and E2B_API_KEY in the Space Secrets.
  2. Type a task (or speak it) and click the "🚀 Let's go!" button.
  3. Watch the agent plan, execute, verify, and report costs.
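
When running outside a Space, a quick check like the one below can confirm that both secrets from step 1 are visible to the process. It is a minimal sketch that assumes the app reads them from environment variables (which is how Space secrets are exposed); the helper name `missing_secrets` is made up for this example.

```python
import os

# Secret names taken from step 1 above.
REQUIRED_SECRETS = ("HF_TOKEN", "E2B_API_KEY")

def missing_secrets(env=None):
    """Return the names of required secrets that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]

missing = missing_secrets()
if missing:
    print("Missing secrets:", ", ".join(missing))
else:
    print("All required secrets are set.")
```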

Sensitive Actions

By default, the agent pauses before:

  • Payments, purchases, subscriptions
  • Sending emails/messages/posts
  • Deleting files or uninstalling software
  • Password/credit-card fields

To disable these human-in-the-loop (HITL) pauses, enable "Auto-approve all actions" under Advanced Options.
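
One simple way to implement such a gate is keyword matching on the action description. The sketch below is a hedged illustration rather than the agent's real detection logic: the patterns, the category names, and the functions `sensitive_category` and `needs_approval` are all assumptions chosen to mirror the list above.

```python
import re

# Hypothetical keyword patterns, one per category in the list above.
SENSITIVE_PATTERNS = {
    "payment": r"\b(pay|purchase|buy|subscribe|checkout)\b",
    "messaging": r"\b(send|post|publish)\b.*\b(email|message|post|tweet)\b",
    "deletion": r"\b(delete|remove|uninstall)\b",
    "credentials": r"\b(password|credit[- ]?card|cvv)\b",
}

def sensitive_category(action):
    """Return the first matching category name, or None if nothing matched."""
    text = action.lower()
    for category, pattern in SENSITIVE_PATTERNS.items():
        if re.search(pattern, text):
            return category
    return None

def needs_approval(action, auto_approve=False):
    """Pause for the user unless auto-approve is on or nothing sensitive matched."""
    return not auto_approve and sensitive_category(action) is not None

print(needs_approval("Click the Buy now button"))
print(needs_approval("Scroll down the page"))
print(needs_approval("Delete report.pdf", auto_approve=True))
```

Keyword matching is cheap but coarse; it will miss paraphrased actions and flag some harmless ones, so a production gate would likely combine it with page context or a model-based check.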

Cost Budget

The default budget is $2.00 USD per session. The router automatically downgrades to cheaper models as the budget is consumed.
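
The downgrade behavior can be pictured with a toy router like the one below. Only the $2.00 default comes from the text; the class `BudgetRouter`, the 50% threshold, and the tier names are assumptions for illustration, and the real router may use different rules.

```python
from dataclasses import dataclass

@dataclass
class BudgetRouter:
    """Toy router: picks a model tier from the fraction of budget remaining."""
    budget_usd: float = 2.00  # default session budget
    spent_usd: float = 0.0

    def record(self, cost_usd):
        """Add the cost of a completed model call to the running total."""
        self.spent_usd += cost_usd

    @property
    def remaining_fraction(self):
        return max(0.0, 1.0 - self.spent_usd / self.budget_usd)

    def pick(self, needs_vision, hard_task):
        kind = "vision" if needs_vision else "text"
        # Assumed rule: the powerful tier is allowed only for hard tasks
        # and only while more than half of the budget remains.
        powerful = hard_task and self.remaining_fraction > 0.5
        return f"{'powerful' if powerful else 'fast'}-{kind}"

router = BudgetRouter()
print(router.pick(needs_vision=True, hard_task=True))  # powerful-vision
router.record(1.20)  # 60% of the budget is now spent
print(router.pick(needs_vision=True, hard_task=True))  # fast-vision
```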

Benchmarks

Run the built-in eval suite:

from eval_harness import EvaluationHarness
# See eval_harness.py for usage

Credits
