Xiangyi Li

xdotli

https://www.xiangyi.li

AI & ML interests

None yet

Recent Activity

submitted a paper 16 days ago

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

upvoted a paper 11 days ago

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Paper • 2601.11868 • Published Jan 17 • 34

upvoted a paper 16 days ago

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Paper • 2510.02209 • Published Oct 2, 2025 • 57

upvoted a collection 17 days ago

SkillsBench

1 item • Updated 17 days ago • 1

upvoted a paper 17 days ago

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Paper • 2602.12670 • Published 21 days ago • 54

upvoted 2 papers 12 months ago

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published Jan 16, 2025 • 32

HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

Paper • 2503.02003 • Published Mar 3, 2025 • 48