whisper-tiny-it-multi

A fine-tune of openai/whisper-tiny (39M params) for Italian ASR, trained on a combination of Common Voice, MLS, and VoxPopuli.

Author: Ettore Di Giacinto

Brought to you by the LocalAI team. This model can be used directly with LocalAI.

Usage with LocalAI

This model is ready to use with LocalAI via the whisperx backend.

Save the following as whisperx-tiny-it-multi.yaml in your LocalAI models directory:

name: whisperx-tiny-it-multi
backend: whisperx
known_usecases:
  - transcript
parameters:
  model: LocalAI-io/whisper-tiny-it-multi-ct2-int8
  language: it

Then transcribe audio via the OpenAI-compatible endpoint:

curl http://localhost:8080/v1/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -F file="@audio.mp3" \
  -F model="whisperx-tiny-it-multi"
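
The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes LocalAI is listening on localhost:8080 as in the curl example and that the response is OpenAI-style JSON with a "text" field. The helper names build_multipart and transcribe are illustrative and not part of LocalAI.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, model="whisperx-tiny-it-multi",
               url="http://localhost:8080/v1/audio/transcriptions"):
    """POST an audio file to the transcription endpoint and return the text."""
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"model": model}, "file", os.path.basename(path), audio
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the server answers with JSON containing a "text" key.
        return json.load(response)["text"]
```

With the server running, transcribe("audio.mp3") should return the transcript as a string. The model argument must match the name field of the YAML file above.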

Results

Evaluated on the combined test set (Common Voice + MLS + VoxPopuli; 17,598 samples):

Step     WER
1000     39.8%
3000     33.5%
5000     31.4%
10000    29.4%
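
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The sketch below illustrates that definition only. This card does not state which tool or text normalization produced the numbers above, so the lowercasing and whitespace tokenization here are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("gatto" -> "gato") and one deletion ("sul") over 5 words.
print(word_error_rate("il gatto dorme sul divano", "il gato dorme divano"))  # 0.4
```

Corpus-level WER is usually computed by summing errors and reference words over all utterances before dividing, rather than averaging per-utterance scores.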

Training Details

  • Base model: openai/whisper-tiny (39M parameters)
  • Datasets: Common Voice 25.0 Italian (173k) + MLS Italian (60k) + VoxPopuli Italian (23k) = 255k train samples
  • Steps: 10,000 (batch size 32)
  • Learning rate: 1e-5 with 500 warmup steps
  • Precision: bf16 on NVIDIA GB10
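
As a rough sanity check on these figures, the sketch below works out how many passes over the data 10,000 steps represent and what the learning rate looks like during warmup. It assumes that 32 is the effective batch size (no gradient accumulation) and that warmup is linear. The linear decay after warmup is also an assumption, since the card does not name the scheduler.

```python
TRAIN_SAMPLES = 255_000  # total train samples listed above
STEPS = 10_000
BATCH_SIZE = 32          # assumed to be the effective batch size
PEAK_LR = 1e-5
WARMUP_STEPS = 500

samples_seen = STEPS * BATCH_SIZE      # 320,000 samples
epochs = samples_seen / TRAIN_SAMPLES  # roughly 1.25 passes over the data


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero.

    The decay shape is an assumption; only the peak value and the
    number of warmup steps come from the card.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    return PEAK_LR * max(0.0, (STEPS - step) / (STEPS - WARMUP_STEPS))


print(f"samples seen: {samples_seen}, epochs: {epochs:.2f}")
print(f"lr at step 250: {lr_at(250):.1e}, at step 500: {lr_at(500):.1e}")
```

Under these assumptions the model sees each training sample a little more than once.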

Usage

Transformers

from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="LocalAI-io/whisper-tiny-it-multi")
result = pipe("audio.mp3", generate_kwargs={"language": "it", "task": "transcribe"})
print(result["text"])

CTranslate2 / faster-whisper

For optimized CPU inference, use the int8 CTranslate2 variant: LocalAI-io/whisper-tiny-it-multi-ct2-int8
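
A minimal faster-whisper sketch follows. It assumes the faster-whisper package is installed and that the repository above is laid out so WhisperModel can load it directly from the Hub. The helper names join_segments and transcribe_file are illustrative.

```python
def join_segments(segments) -> str:
    """Concatenate segment texts into a single transcript string."""
    return " ".join(segment.text.strip() for segment in segments).strip()


def transcribe_file(
    path: str,
    repo: str = "LocalAI-io/whisper-tiny-it-multi-ct2-int8",
) -> str:
    """Transcribe an Italian audio file on CPU with int8 weights."""
    # Imported lazily so join_segments stays usable without faster-whisper.
    from faster_whisper import WhisperModel

    model = WhisperModel(repo, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus an info object;
    # decoding actually runs while the generator is consumed.
    segments, _info = model.transcribe(path, language="it", task="transcribe")
    return join_segments(segments)
```

Calling transcribe_file("audio.mp3") downloads the model on first use and returns the transcript as one string.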
