microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated Dec 10, 2025 • 335k • 1.58k
ibm-granite/granite-vision-3.2-2b Image-Text-to-Text • 3B • Updated Jun 12, 2025 • 10.4k • 121
nomic-ai/colnomic-embed-multimodal-7b Visual Document Retrieval • Updated Apr 15, 2025 • 5.01k • 101
Vidore Leaderboard 🥇 • Running • 197 • Compare and rank visual document retrieval models across different benchmarks
nomic-ai/nomic-embed-multimodal-3b Visual Document Retrieval • Updated Apr 15, 2025 • 1.76k • 29
moondream/moondream-2b-2025-04-14-4bit Image-Text-to-Text • 1B • Updated May 22, 2025 • 5.24k • 63
Alibaba-NLP/gme-Qwen2-VL-2B-Instruct Sentence Similarity • 2B • Updated Jun 9, 2025 • 59.5k • 133