whisper-tiny-it-multi
Fine-tuned openai/whisper-tiny (39M params) for Italian ASR on multiple datasets.
Author: Ettore Di Giacinto
Brought to you by the LocalAI team. This model can be used directly with LocalAI.
Usage with LocalAI
This model is ready to use with LocalAI via the whisperx backend.
Save the following as whisperx-tiny-it-multi.yaml in your LocalAI models directory:
name: whisperx-tiny-it-multi
backend: whisperx
known_usecases:
  - transcript
parameters:
  model: LocalAI-io/whisper-tiny-it-multi-ct2-int8
  language: it
Then transcribe audio via the OpenAI-compatible endpoint:
curl http://localhost:8080/v1/audio/transcriptions \
-H "Content-Type: multipart/form-data" \
-F file="@audio.mp3" \
-F model="whisperx-tiny-it-multi"
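The same request can be made from Python. The sketch below uses only the standard library and assumes the endpoint URL and model name from the curl example. The helper names `build_multipart` and `transcribe` are illustrative, and reading a `text` field from the JSON response assumes the server follows the OpenAI transcription response shape.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Assumed defaults, copied from the curl example.
ENDPOINT = "http://localhost:8080/v1/audio/transcriptions"
MODEL_NAME = "whisperx-tiny-it-multi"


def build_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    # Plain text fields such as model="whisperx-tiny-it-multi".
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # The audio file itself, sent under the field name "file".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            f"Content-Type: {file_type}\r\n\r\n"
        ).encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, endpoint=ENDPOINT, model=MODEL_NAME):
    """POST an audio file to the endpoint and return the transcribed text."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": model}, audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the response is JSON with a top-level "text" field.
        return json.loads(response.read())["text"]
```

With LocalAI running locally, `print(transcribe("audio.mp3"))` would print the transcript.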
Results
Word error rate (WER) on the combined test set (Common Voice + MLS + VoxPopuli, 17,598 samples) at successive training steps:
| Step | WER |
|---|---|
| 1000 | 39.8% |
| 3000 | 33.5% |
| 5000 | 31.4% |
| 10000 | 29.4% |
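For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The card does not state which tool or text normalization produced the figures above, so the following is only a minimal sketch of the standard definition. The function name and the example sentences are illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# One substituted word and one dropped word out of five reference words.
print(word_error_rate("il gatto dorme sul divano", "il gato dorme divano"))
```

Corpus-level WER is usually computed by summing errors and reference words over all utterances before dividing, rather than by averaging per-utterance scores.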
Training Details
- Base model: openai/whisper-tiny (39M parameters)
- Datasets: Common Voice 25.0 Italian (173k) + MLS Italian (60k) + VoxPopuli Italian (23k) ≈ 255k train samples
- Steps: 10,000 (batch size 32)
- Learning rate: 1e-5 with 500 warmup steps
- Precision: bf16 on NVIDIA GB10
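The figures above imply roughly 1.25 passes over the training data, which the sketch below works out alongside the learning rate at a given step. Two assumptions are made that the card does not confirm: that 32 is the effective batch size (no gradient accumulation or multi-device scaling), and that the learning rate decays linearly to zero after warmup, which is the default scheduler type in Hugging Face `TrainingArguments`. The function names are illustrative.

```python
# Values taken from the training details above.
TRAIN_SAMPLES = 255_000
TOTAL_STEPS = 10_000
BATCH_SIZE = 32
PEAK_LR = 1e-5
WARMUP_STEPS = 500


def epochs_seen(steps=TOTAL_STEPS, batch_size=BATCH_SIZE, samples=TRAIN_SAMPLES):
    """Number of passes over the training set, assuming batch_size is effective."""
    return steps * batch_size / samples


def learning_rate(step, peak=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to `peak`, then linear decay to zero (assumed schedule)."""
    if step < warmup:
        return peak * step / warmup
    remaining = max(0, total - step)
    return peak * remaining / (total - warmup)


print(f"epochs: {epochs_seen():.2f}")
for step in (0, 250, 500, 5000, 10000):
    print(step, learning_rate(step))
```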
Usage
Transformers
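from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-tiny-it-multi")
# Request Italian transcription explicitly instead of relying on language detection.
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])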
CTranslate2 / faster-whisper
For optimized CPU inference, use the INT8 CTranslate2 conversion: LocalAI-io/whisper-tiny-it-multi-ct2-int8
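A minimal sketch of loading the CTranslate2 model with faster-whisper follows. It assumes the `faster-whisper` package is installed and that the repository can be loaded by its Hub ID. The helper names `join_segments` and `transcribe` are illustrative, and `audio.mp3` is a placeholder path.

```python
def join_segments(segments):
    """Concatenate the text of each transcribed segment into one string."""
    return " ".join(segment.text.strip() for segment in segments)


def transcribe(path, model_id="LocalAI-io/whisper-tiny-it-multi-ct2-int8"):
    """Transcribe an Italian audio file on CPU with INT8 weights."""
    # Imported here so join_segments stays usable without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_id, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(path, language="it", task="transcribe")
    return join_segments(segments)
```

Calling `print(transcribe("audio.mp3"))` would download the model on first use and print the transcript.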
Links
- CV-only version: LocalAI-io/whisper-tiny-it (WER 27.1% on CV test)
- CTranslate2 INT8: LocalAI-io/whisper-tiny-it-multi-ct2-int8
- Code: github.com/localai-org/italian-whisper
- LocalAI: github.com/mudler/LocalAI