Seungyeop Yi

devpotatopotato

AI & ML interests

None yet

Recent Activity

authored 2 papers 5 days ago

CostNav: A Navigation Benchmark for Real-World Economic-Cost Evaluation of Physical AI Agents

Paper • 2511.20216 • Published Nov 25, 2025

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published 10 days ago • 77

upvoted a paper 7 days ago

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published 10 days ago • 77

updated a model about 1 month ago

pi-research/qwen3-8b-op-tr-20260326

8B • Updated Apr 5 • 2

published a model about 2 months ago

pi-research/qwen3-8b-op-tr-20260326

8B • Updated Apr 5 • 2

upvoted a paper 2 months ago

Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

Paper • 2602.06291 • Published Feb 6 • 24