Speech2Text

Speech2Text is a web application for transcribing speech to text. It provides a simple interface where users upload an audio file and receive the transcribed text. Transcription uses pretrained Whisper models hosted on Hugging Face, and the transcribed text can be translated into more than 200 languages. The app is built on Xenova Transformers (Transformers.js), a JavaScript port of Hugging Face's Python transformers library, so the same pretrained models run directly in the user's browser.

Live View

Used Models

The Speech2Text application utilizes the following models for transcription:

Whisper Tiny (English): Model name: openai/whisper-tiny.en
Whisper Tiny: Model name: openai/whisper-tiny
Whisper Base: Model name: openai/whisper-base
Whisper Base (English): Model name: openai/whisper-base.en
Whisper Small: Model name: openai/whisper-small
Whisper Small (English): Model name: openai/whisper-small.en
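
The six model names above follow a regular pattern: a size (tiny, base, or small) plus an optional `.en` suffix for the English-only variant. A minimal sketch of a lookup helper follows; `resolveModelName` and `MODEL_SIZES` are hypothetical names used for illustration and are not taken from the project.

```javascript
// Hypothetical helper (not from the project): maps a size and an
// English-only flag to one of the model names listed above.
const MODEL_SIZES = ['tiny', 'base', 'small'];

function resolveModelName(size, englishOnly = false) {
  if (!MODEL_SIZES.includes(size)) {
    throw new RangeError(`Unsupported model size: ${size}`);
  }
  // English-only checkpoints carry a ".en" suffix.
  const suffix = englishOnly ? '.en' : '';
  return `openai/whisper-${size}${suffix}`;
}

console.log(resolveModelName('tiny', true));
console.log(resolveModelName('small'));
```

The multilingual variants are needed whenever the audio is not in English; the `.en` variants only make sense for English audio.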

Components

The components folder contains Transcribing, Transcription, and Translation, each serving a specific role in the transcription process. In addition, the utility files presets.js, translate.worker.js, and whisper.worker.js handle preset configurations, translation, and transcription; the two worker files run their tasks asynchronously.

Transcribing.jsx: This component manages the display of the transcription process, providing visual feedback on the progress of transcription.
Transcription.jsx: Responsible for displaying the transcribed text.
Translation.jsx: Enables translation of the transcribed text into different languages, allowing users to select the target language and trigger the translation process.
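
One way the hand-off between Transcribing.jsx (progress) and Transcription.jsx (result) could work is to fold the status messages coming from a worker into a single UI state object. The sketch below is illustrative only: the message type values, the state shape, and the `reduceWorkerMessage` function are assumptions and are not taken from presets.js or the components themselves.

```javascript
// Assumed message types; the real values live in presets.js and may differ.
const MessageTypes = Object.freeze({
  LOADING: 'LOADING',
  DOWNLOADING: 'DOWNLOADING',
  RESULT: 'RESULT',
  DONE: 'INFERENCE_DONE',
});

// Assumed UI state shape for this sketch.
const initialState = { status: 'idle', progress: 0, segments: [] };

// Pure function: returns the next UI state for one worker message.
function reduceWorkerMessage(state, message) {
  switch (message.type) {
    case MessageTypes.LOADING:
      return { ...state, status: 'loading' };
    case MessageTypes.DOWNLOADING:
      return { ...state, status: 'downloading', progress: message.progress };
    case MessageTypes.RESULT:
      return {
        ...state,
        status: 'transcribing',
        segments: [...state.segments, ...message.results],
      };
    case MessageTypes.DONE:
      return { ...state, status: 'done', progress: 100 };
    default:
      // Unknown messages leave the state untouched.
      return state;
  }
}
```

In a React component, a reducer of this shape could be passed to `useReducer`, with the worker's `onmessage` handler dispatching `event.data`. Keeping it a pure function makes the progress logic testable without a browser.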

Utilities

1. presets.js: Contains preset configurations used across the project, including message types, loading status, model names, and supported languages.

2. translate.worker.js: Handles translation tasks asynchronously using the Xenova transformers library.

3. whisper.worker.js: Manages the transcription process asynchronously using the Xenova transformers library for automatic speech recognition.
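
When timestamps are requested, the automatic speech recognition pipeline in the Xenova transformers library returns an object with a `text` string and a `chunks` array, where each chunk carries a `[start, end]` timestamp in seconds. The sketch below shows how a worker might turn that output into display-ready segments before posting it to the main thread. The helper names `formatTime` and `toSegments`, the segment shape, and the sample data are assumptions made for illustration; the project's actual worker may shape its results differently.

```javascript
// Hypothetical helper: formats a number of seconds as mm:ss for display.
function formatTime(seconds) {
  const total = Math.floor(seconds);
  const mm = String(Math.floor(total / 60)).padStart(2, '0');
  const ss = String(total % 60).padStart(2, '0');
  return `${mm}:${ss}`;
}

// Hypothetical helper: converts pipeline output into a list of segments.
function toSegments(output) {
  return (output.chunks ?? []).map((chunk, index) => {
    const [start, end] = chunk.timestamp;
    return {
      index,
      start: formatTime(start),
      // Handled defensively in case the final chunk has no end time.
      end: end == null ? null : formatTime(end),
      text: chunk.text.trim(),
    };
  });
}

// Made-up sample in the shape described above.
const sample = {
  text: ' Hello world. How are you?',
  chunks: [
    { timestamp: [0, 2.5], text: ' Hello world.' },
    { timestamp: [2.5, 65], text: ' How are you?' },
  ],
};

console.log(toSegments(sample));
```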

