Hugging Face
adarshzolekar's Collections
Audio & Speech Models
Updated Jan 23
Purpose: Speech recognition, text-to-speech, music, audio analysis.
openai/whisper-large-v3 • Automatic Speech Recognition • Updated Aug 12, 2024 • 5.86M downloads • 5.45k likes
facebook/wav2vec2-base-960h • Automatic Speech Recognition • 94.4M parameters • Updated Nov 14, 2022 • 3.5M downloads • 391 likes
coqui/XTTS-v2 • Text-to-Speech • Updated Dec 11, 2023 • 7.51M downloads • 3.42k likes
microsoft/speecht5_tts • Text-to-Speech • Updated Nov 8, 2023 • 111k downloads • 823 likes
facebook/musicgen-small • Text-to-Audio • Updated Nov 17, 2023 • 153k downloads • 480 likes