Python scripts and experiments used for my research paper on speech-to-text AI models. This repository includes model evaluations, accuracy benchmarks, and transcript quality comparisons using various open-source and cloud-based STT APIs.
> Place your audio file in the same directory and update `audio_path` in the scripts.
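For example, near the top of each script you would edit a line like the following (the filename here is a hypothetical placeholder):

```python
# Point audio_path at your own recording (hypothetical filename)
audio_path = "my-recording.wav"
```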
Follow these steps to set up and run the Whisper transcription script:
Make sure you have Python 3.10 installed on your machine.
⚠️ Whisper requires Python 3.10 due to dependency compatibility. Using other versions (like 3.11 or 3.12) may cause installation issues or runtime errors.
```bash
python3.10 -m venv whisper-venv
source whisper-venv/bin/activate
pip install openai-whisper
python3 stt-whisper-model.py
deactivate
```
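The repository's `stt-whisper-model.py` is not reproduced here; as a rough sketch of what such a script does (the `"base"` model size and the default filename are assumptions, not the repo's actual settings), a minimal Whisper transcription looks something like this:

```python
def transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file with OpenAI Whisper and return the text."""
    import whisper  # imported lazily so the module loads even without the package

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

Inside the venv, calling `transcribe("audio.wav")` and printing the result is all the script needs to do; larger model names (`"small"`, `"medium"`, `"large"`) trade speed for accuracy.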
```bash
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
mv vosk-model-small-en-us-0.15 vosk-model
```
This downloads the model and extracts it to a folder named `vosk-model`, which the script expects.
```bash
python3.10 -m venv vosk-venv
source vosk-venv/bin/activate
pip install vosk
python3 stt-vosk-model.py path/to/your/audio.wav
```
Replace `path/to/your/audio.wav` with the path to your WAV file (mono, 16 kHz).
After running, two files will be generated in the script directory:
- `transcription.json` — full transcription plus word-level timestamps
- `transcription.txt` — full transcription as plain text
```bash
deactivate
```
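As a rough sketch of the Vosk workflow (not the repository's exact code — chunk size and structure are assumptions), the core loop reads the WAV in chunks, collects per-chunk results with word timestamps, and writes the two output files described above:

```python
import json
import wave

def words_to_text(results: list[dict]) -> str:
    """Join Vosk per-chunk results into one plain-text transcript."""
    return " ".join(r["text"] for r in results if r.get("text"))

def transcribe(audio_path: str, model_dir: str = "vosk-model") -> list[dict]:
    """Run Vosk over a mono 16 kHz WAV file, returning per-chunk results."""
    from vosk import KaldiRecognizer, Model  # lazy import; pip install vosk

    wf = wave.open(audio_path, "rb")
    rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
    rec.SetWords(True)  # include word-level timestamps in each result
    results = []
    while True:
        data = wf.readframes(4000)  # chunk size is an arbitrary choice
        if not data:
            break
        if rec.AcceptWaveform(data):
            results.append(json.loads(rec.Result()))
    results.append(json.loads(rec.FinalResult()))
    return results

def save(results: list[dict]) -> None:
    """Write transcription.json (with timestamps) and transcription.txt."""
    with open("transcription.json", "w") as f:
        json.dump(results, f, indent=2)
    with open("transcription.txt", "w") as f:
        f.write(words_to_text(results))
```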
```bash
python3.10 -m venv assemblyai-venv
source assemblyai-venv/bin/activate
pip install requests python-dotenv
python3 stt-assemblyai-model.py
deactivate
```
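As a rough sketch of what an AssemblyAI script does with `requests` and `dotenv` (the environment variable name `ASSEMBLYAI_API_KEY` and the polling interval are assumptions, not necessarily what the repository uses): upload the local file, create a transcription job, then poll until it finishes.

```python
import time

API_BASE = "https://api.assemblyai.com/v2"

def auth_headers(api_key: str) -> dict:
    """AssemblyAI authenticates with the bare key in the authorization header."""
    return {"authorization": api_key}

def transcribe(audio_path: str) -> str:
    """Upload a local audio file to AssemblyAI and poll until it is transcribed."""
    import os
    import requests
    from dotenv import load_dotenv  # pip install python-dotenv

    load_dotenv()  # loads ASSEMBLYAI_API_KEY from a .env file (assumed name)
    headers = auth_headers(os.environ["ASSEMBLYAI_API_KEY"])

    # 1. Upload the raw audio bytes; the API returns a temporary URL.
    with open(audio_path, "rb") as f:
        upload = requests.post(f"{API_BASE}/upload", headers=headers, data=f)
    audio_url = upload.json()["upload_url"]

    # 2. Create a transcription job for that URL.
    job = requests.post(f"{API_BASE}/transcript", headers=headers,
                        json={"audio_url": audio_url}).json()

    # 3. Poll until the job completes or errors out.
    while True:
        poll = requests.get(f"{API_BASE}/transcript/{job['id']}",
                            headers=headers).json()
        if poll["status"] in ("completed", "error"):
            return poll.get("text") or poll.get("error", "")
        time.sleep(3)
```

Note that `pip install python-dotenv` installs the package imported as `dotenv`.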