Melvin Vivas
- Omni Video Factory 🏆: text to video, image to video, video extend (Running on Zero • Agents • Featured • 921)
- Wan2.2 14B Preview 🐌: generate a video from an image with a text prompt (Running on Zero • MCP • 2.31k)
- Wan2.2 14B Fast 🎥: generate a video from an image with a text prompt (Running on Zero • MCP • Featured • 2.93k)
- LTX 2.3 Sync 🕺: Portrait animation & lipsync with LTX 2.3 (Running on Zero • Agents • Featured • 131)
- Qwen-3-VL-8B OCR Receipts 🚀: structured data parser from receipt images (Running • 1)
- Qwen3 Omni Demo ⚡: Chat with multimodal AI using text, audio, images, and video (Running • Agents • Featured • 262)
- VLM Object Understanding 🦀: Explore object detection, visual grounding, keypoint Detecti (Running on Zero • Agents • Featured • 115)
- Dataset Card Drafter 😻: Create dataset descriptions and open PRs automatically (Sleeping • Agents • 2)
- VibeVoice-Realtime-0.5B 🐨: Generate natural speech from text with selectable voices (Running on Zero • Agents • Featured • 184)
- microsoft/VibeVoice-1.5B (Text-to-Speech • 3B • 149k • 2.34k)
- Qwen3 TTS Demo 🚀: Generate spoken audio from text in many voices and languages (Running • Agents • Featured • 400)
- mradermacher/Qwen3-1.7B-Multilingual-TTS-GGUF (2B • 4.35k • 10)
- Open VLM Leaderboard 🌎: VLMEvalKit Evaluation Results Collection (Running on CPU Upgrade • Agents • 1.01k)
- DeepSeek OCR 2 Demo 🚀: Try out DeepSeek-OCR-2 on your PDFs or images (Running on Zero • Agents • Featured • 465)
- Multimodal OCR3 🌖: Chandra-OCR / Nanonets-OCR2 / olmOCR-2 / Dots.OCR (Running on Zero • MCP • 68)
- Qwen/Qwen3-VL-30B-A3B-Instruct (Image-Text-to-Text • 1.13M • 564)
- openai/whisper-large-v3 (Automatic Speech Recognition • 2B • 4.87M • 5.61k)
- openai/whisper-large-v3-turbo (Automatic Speech Recognition • 0.8B • 6.88M • 2.96k)
- Whisper Large V3 🤫: Transcribe or translate audio and YouTube videos to text (Running on Zero • MCP • Featured • 829)
- Kugel Audio 👀: Generate natural-sounding speech in European languages with voice cloning (Running on Zero • Agents • Featured • 90)