Milos Bijanic
biki96
AI & ML interests
None yet
Recent Activity
updated a collection 6 days ago: OCR
updated a collection 6 days ago: LLM
updated a collection 6 days ago: OCR
Organizations
None yet
LLM
OCR
STT
- Open ASR Leaderboard
  Running on CPU Upgrade • Featured • 🏆 1.21k
  View and request speech models benchmark data
- nvidia/canary-qwen-2.5b
  Automatic Speech Recognition • 3B • Updated • 109k • 359
- nvidia/parakeet-tdt-0.6b-v3
  Automatic Speech Recognition • Updated • 87.3k • 584
- nvidia/parakeet-tdt-0.6b-v2
  Automatic Speech Recognition • Updated • 448k • 1.41k
Text2Image
IT3D
I2V
diffusion
- DreamGaussian4D: Generative 4D Gaussian Splatting
  Paper • 2312.17142 • Published • 19
- Presto! Distilling Steps and Layers for Accelerating Music Generation
  Paper • 2410.05167 • Published • 18
- OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction
  Paper • 2410.04932 • Published • 9
- Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices
  Paper • 2410.11795 • Published • 18
TTS
Text2Image
LLM
IT3D
OCR
I2V
STT