Laguna-XS.2/.eval_results/swe-bench_pro.yaml
Add community evaluation results for SWE-BENCH_PRO, SWE-BENCH_VERIFIED, TERMINAL-BENCH-2.0 (#5)
06367f7
- dataset:
    id: ScaleAI/SWE-bench_Pro
    task_id: SWE_Bench_Pro
  value: 44.5
  source:
    url: https://huggingface.co/poolside/Laguna-XS.2
    name: Model Card