---
license: cc
datasets:
- speechlab/SPRING_INX_R1
tags:
- ASR
- speech-recognition
---

# Fairseq Inference Setup and Usage

This repository provides a streamlined setup and usage guide for running inference with Fairseq models, tailored for automatic speech recognition (ASR).

## Table of Contents

1. [Setup Instructions](#setup-instructions)
2. [Download Required Models](#download-required-models)
3. [Running Inference](#running-inference)
4. [Getting Transcripts](#getting-transcripts)

---

### Setup Instructions

To set up the environment and install the dependencies needed for Fairseq inference, follow these steps.

#### 1. Create and Activate a Virtual Environment

Choose between Python's `venv` or Conda for environment management.

Using `venv`:
```bash
python3.8 -m venv lm_env  # use python3.8 or adjust for your preferred version
source lm_env/bin/activate
```

Using Conda:
```bash
conda create -n fairseq_inference python=3.8.10
conda activate fairseq_inference
```

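Either way, a quick check confirms the environment is active before installing anything into it (a sketch; `lm_env` is the `venv` name used above):

```bash
# The interpreter should resolve to the newly created environment
# (e.g. .../lm_env/bin/python), and the version should match what you requested.
which python
python --version
```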
#### 2. Install PyTorch and CUDA

Install the appropriate version of PyTorch and CUDA for your setup:
```bash
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
```

If using Python 3.10.15 and CUDA 12.4:
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```

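After installing, it is worth verifying that the wheel you got actually sees the GPU (this assumes the install above succeeded):

```bash
# Prints the installed torch version, followed by True if CUDA is usable
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```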
#### 3. Install Additional Packages

```bash
pip install wheel soundfile editdistance pyarrow tensorboard tensorboardX
```

#### 4. Clone the Fairseq Inference Repository

```bash
git clone https://github.com/Speech-Lab-IITM/Fairseq-Inference.git
cd Fairseq-Inference/fairseq-0.12.2
pip install --editable ./
python setup.py build develop
```

---

### Download Required Models

Download the models required for your ASR task and place them in a directory; that directory is passed to the inference script as `model_path`.

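The exact files depend on the model release, but a sanity check along these lines catches a missing download early (`model_dir` is a placeholder for your actual `model_path`):

```bash
# List the model directory if it exists; otherwise remind yourself to download first.
# "model_dir" is a placeholder -- substitute your actual model_path.
if [ -d model_dir ]; then
    ls model_dir
else
    echo "model_dir not found - download the model first"
fi
```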
### Running Inference

Once setup is complete and the models are downloaded, run inference with:

```bash
python3 infer.py model_path audio_path
```

The script takes the model directory (`model_path`) and an audio file (`audio_path`) and generates a transcription.

### Getting Transcripts

After the inference script finishes, the transcript for the provided audio file appears in its output.
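To transcribe many files, the single-file command above extends naturally to a loop (a sketch; `audio_dir`, `model_dir`, and `transcripts.txt` are placeholder names):

```bash
# Run infer.py on every .wav file in audio_dir and collect all transcripts,
# prefixing each one with the file it came from.
for wav in audio_dir/*.wav; do
    echo "== $wav ==" >> transcripts.txt
    python3 infer.py model_dir "$wav" >> transcripts.txt
done
```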