
Using vosk speech recognition toolkit to be able to transcribe audio offline.


andrewymin/audio-to-text


Audio to Text

A Python program that transcribes MP3 files to text, whether you are online or offline.

Getting Started

First, create a folder to hold the cloned git project. Inside this folder, also create a folder called “vosk_lang” to hold the Vosk language models. Vosk models are available at https://alphacephei.com/vosk/models. Choose the model for the language you want to transcribe. This repo is set up to use vosk-model-en-us-0.22; this can be changed later in the project files to suit your needs. Download the model into the “vosk_lang” folder you created earlier so that it is ready for setup.
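As a rough illustration of the layout described above, the sketch below resolves the model folder inside “vosk_lang” and fails early if it is missing. The names `MODEL_NAME` and `find_model_dir` are hypothetical and are not taken from the project files; the sketch also assumes the unzipped folder keeps the model's name.

```python
from pathlib import Path

# Assumption: the unzipped folder keeps the model name used by this repo.
MODEL_NAME = "vosk-model-en-us-0.22"


def find_model_dir(base_dir, model_name=MODEL_NAME):
    """Return the unzipped model folder inside vosk_lang, or raise if it is missing."""
    model_dir = Path(base_dir) / "vosk_lang" / model_name
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Expected an unzipped Vosk model at {model_dir}")
    return model_dir
```

To use a different language, pass another model name, for example `find_model_dir(".", "name-of-your-model")`.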

Prerequisites

  • An IDE that can run Python, such as PyCharm.
  • The Vosk model folder, downloaded and ready to use.
  • Git installed on the computer.
  • An MP3 audio file to transcribe.

Installing

  1. Go to your preferred python IDE.
  2. Then open up the terminal in that IDE.
  3. Navigate to the folder you created earlier for the git project.
  4. Once in that folder, go to GitHub, click the green Code button, and choose a download method (usually HTTPS).
  5. Copy the link, go back to the IDE terminal, and type “git clone {place copied url here}”.
  6. After you press Enter, the repo will be cloned into that folder.
  7. Once the download finishes, open the project in your IDE and install the necessary packages.
  8. In the root of the project create 3 folders.
    • audio_files
    • results
    • vosk_lang
  9. Place the downloaded and unzipped Vosk model inside the “vosk_lang” folder.
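
The three folders from step 8 can also be created from Python rather than by hand. This is a minimal sketch, not part of the repo; `REQUIRED_FOLDERS` and `create_project_folders` are hypothetical names chosen for illustration.

```python
from pathlib import Path

# Folder names taken from step 8 above.
REQUIRED_FOLDERS = ("audio_files", "results", "vosk_lang")


def create_project_folders(project_root):
    """Create the expected folders under project_root; existing folders are left alone."""
    root = Path(project_root)
    created = []
    for name in REQUIRED_FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling `create_project_folders(".")` from the project root is safe to repeat, since `exist_ok=True` skips folders that are already there.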

Deployment

After the setup is complete, run the main.py file. The conversion takes a few minutes, depending on the model type. When the program finishes, it plays a notification sound to signal completion.
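
For readers curious about what a Vosk transcription loop can look like, here is a hedged sketch. It is not the contents of main.py, which is not shown here. It assumes the `vosk` package is installed and that `ffmpeg` is available on the PATH to decode the MP3 into 16 kHz mono PCM; the function names are hypothetical, and the notification sound is left out.

```python
import json
import subprocess

SAMPLE_RATE = 16000  # assumption: 16 kHz mono PCM fed to the recognizer


def ffmpeg_command(mp3_path, sample_rate=SAMPLE_RATE):
    """Build an ffmpeg command that writes raw 16-bit mono PCM to stdout."""
    return ["ffmpeg", "-loglevel", "quiet", "-i", str(mp3_path),
            "-ar", str(sample_rate), "-ac", "1", "-f", "s16le", "-"]


def join_results(result_jsons):
    """Join the "text" field of each Vosk JSON result into one transcript."""
    texts = (json.loads(r).get("text", "") for r in result_jsons)
    return " ".join(t for t in texts if t)


def transcribe(mp3_path, model_dir):
    """Stream decoded audio into a Vosk recognizer and return the transcript."""
    # Imported here so the helpers above work even without vosk installed.
    from vosk import Model, KaldiRecognizer

    recognizer = KaldiRecognizer(Model(str(model_dir)), SAMPLE_RATE)
    results = []
    with subprocess.Popen(ffmpeg_command(mp3_path), stdout=subprocess.PIPE) as proc:
        while True:
            chunk = proc.stdout.read(4000)
            if not chunk:
                break
            # AcceptWaveform returns True when a full utterance has been decoded.
            if recognizer.AcceptWaveform(chunk):
                results.append(recognizer.Result())
    results.append(recognizer.FinalResult())
    return join_results(results)
```

A call such as `transcribe("audio_files/example.mp3", "vosk_lang/vosk-model-en-us-0.22")` would return the transcript as a single string; the file name `example.mp3` is only a placeholder.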

Built With

Python - Programming language used
Vosk - Speech recognition library

Authors

Andy Min - Creator
