A collection of benchmarks for evaluating LMs or VLMs under multi-turn interaction
Young-Jun Lee PRO
passing2961
AI & ML interests
Social Dialogue System, Multi-Modal Dialogue
Recent Activity
upvoted a paper about 13 hours ago
Beyond Language Modeling: An Exploration of Multimodal Pretraining
upvoted a paper about 13 hours ago
Qwen3-Coder-Next Technical Report
upvoted a paper about 13 hours ago
BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?