Darwin-28B-Opus/.eval_results/gpqa_diamond.yaml
Final release: Darwin-28B-Opus 88.89% GPQA Diamond (3-stage adaptive) + English README + eval_results + trade-secret removal
b0fe3a0 verified
- dataset:
    id: Idavidrein/gpqa
    task_id: diamond
  value: 88.89
  date: "2026-04-25"
  source:
    url: https://huggingface.co/FINAL-Bench/Darwin-28B-Opus
    name: Darwin-28B-Opus Benchmark (3-stage Adaptive Evaluation)
    user: vidraft