benqi

Aidabenk

AI & ML interests

NLP

Recent Activity

upvoted a paper 4 days ago

EvasionBench: Detecting Evasive Answers in Financial Q&A via Multi-Model Consensus and LLM-as-Judge

upvoted a paper 4 days ago

Controlled Self-Evolution for Algorithmic Code Optimization

upvoted a paper 7 days ago

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Organizations

None yet

upvoted 2 papers 4 days ago

EvasionBench: Detecting Evasive Answers in Financial Q&A via Multi-Model Consensus and LLM-as-Judge

Paper • 2601.09142 • Published 6 days ago • 8

Controlled Self-Evolution for Algorithmic Code Optimization

Paper • 2601.07348 • Published 7 days ago • 108

upvoted a paper 7 days ago

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Paper • 2601.06943 • Published 8 days ago • 202

liked a model 7 days ago

FutureMa/Eva-4B

Text Generation • 4B • Updated 2 days ago • 294 • 81

upvoted a paper 11 days ago

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 14 days ago • 98

liked a model 12 days ago

FutureMa/Qwen3-4B-Evasion

Text Classification • 4B • Updated 12 days ago • 318 • 51

liked a dataset 22 days ago

bigai/TongSIM-Asset

Updated 21 days ago • 20.5k • 276

upvoted a paper 27 days ago

Region-Constraint In-Context Generation for Instructional Video Editing

Paper • 2512.17650 • Published Dec 19, 2025 • 50

upvoted a paper about 1 month ago

Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows

Paper • 2512.13168 • Published Dec 15, 2025 • 49

upvoted 2 papers about 2 months ago

The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment

Paper • 2511.20614 • Published Nov 25, 2025 • 37

SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

Paper • 2511.19320 • Published Nov 24, 2025 • 42

upvoted 2 papers 3 months ago

UniREditBench: A Unified Reasoning-based Image Editing Benchmark

Paper • 2511.01295 • Published Nov 3, 2025 • 38

TrajSelector: Harnessing Latent Representations for Efficient and Effective Best-of-N in Large Reasoning Model

Paper • 2510.16449 • Published Oct 18, 2025 • 34

upvoted a paper 4 months ago

LongCodeZip: Compress Long Context for Code Language Models

Paper • 2510.00446 • Published Oct 1, 2025 • 107

liked a dataset 4 months ago

tencent/WildSpeech-Bench

Viewer • Updated Sep 29, 2025 • 1.1k • 206 • 32

upvoted a paper 4 months ago

RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation

Paper • 2509.16198 • Published Sep 19, 2025 • 126

liked a model 4 months ago

MachineLearningLM/MachineLearningLM-7B-v1

Text Generation • 8B • Updated Oct 1, 2025 • 37 • 34

upvoted a paper 8 months ago

Table-R1: Inference-Time Scaling for Table Reasoning

Paper • 2505.23621 • Published May 29, 2025 • 93

upvoted 2 papers 9 months ago

DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models

Paper • 2504.15716 • Published Apr 22, 2025 • 12

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17, 2025 • 93
