A dictation application for Linux using OpenAI's Whisper. Currently only used on KDE Wayland.

LumenYoung/Whisper-Dictation

Whisper-dictation

This is an app that uses OpenAI's Whisper for dictation on KDE Wayland.

The project is designed as a dictation server that runs in the background (to avoid reloading the model each time dictation starts) and a client that toggles whether the server is recording. You can assign a shortcut to the client command (whisper_dictation say --language [language_code]) to start and stop recording.
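The server/client split described above can be illustrated with a minimal sketch. This is not the project's actual protocol, which is not documented here: the `ToggleServer` class, the `toggle` function, and the plain-text "toggle" message are all hypothetical, and the sketch only flips a flag instead of recording audio.

```python
import socket
import threading


class ToggleServer:
    """Hypothetical stand-in for the dictation daemon: it only tracks
    whether it is 'recording' and flips that flag on each client request."""

    def __init__(self, host="127.0.0.1", port=0):
        # port=0 lets the OS pick a free port; the usage example in this
        # README passes --port 9000 to the real daemon.
        self._sock = socket.create_server((host, port))
        self.port = self._sock.getsockname()[1]
        self.recording = False

    def handle_one(self):
        """Accept a single client connection and process its request."""
        conn, _ = self._sock.accept()
        with conn:
            if conn.recv(16).strip() == b"toggle":
                self.recording = not self.recording
                conn.sendall(b"started" if self.recording else b"stopped")

    def close(self):
        self._sock.close()


def toggle(port, host="127.0.0.1"):
    """Client side: send one toggle request and return the server's reply."""
    with socket.create_connection((host, port), timeout=2) as conn:
        conn.sendall(b"toggle\n")
        return conn.recv(16).decode()


if __name__ == "__main__":
    server = ToggleServer()
    worker = threading.Thread(
        target=lambda: [server.handle_one() for _ in range(2)], daemon=True
    )
    worker.start()
    print(toggle(server.port))  # first press of the shortcut
    print(toggle(server.port))  # second press of the shortcut
    worker.join(timeout=2)
    server.close()
```

Keeping the long-lived state (in the real project, the loaded model) in the server means each shortcut press only pays for a short local connection.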

Whenever dictation is stopped, the transcribed text is copied to your clipboard and a notification is displayed.

The project depends on kdialog and wl-copy (from the wl-clipboard package).

This project is designed to work on KDE Wayland. Other Wayland platforms might work as well, but without notifications.
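The clipboard and notification steps described above could look roughly like the following. This is a sketch under assumptions, not the project's code: the `deliver` function and the popup title are made up for illustration, and the `run` and `which` parameters exist only so the function can be tried without the real tools installed. It relies on `wl-copy` reading text from stdin and on `kdialog --passivepopup`, and it skips the notification when kdialog is missing, which mirrors the non-KDE case.

```python
import shutil
import subprocess


def deliver(text, run=subprocess.run, which=shutil.which):
    """Copy `text` to the Wayland clipboard and, if possible, show a popup.

    Returns True if a notification was sent, False if only the clipboard
    was updated.
    """
    if which("wl-copy") is None:
        raise RuntimeError("wl-copy not found; install the wl-clipboard package")
    # wl-copy reads the text to copy from stdin when given no arguments.
    run(["wl-copy"], input=text.encode(), check=True)

    if which("kdialog") is None:
        return False  # no kdialog available: clipboard only, no notification
    # --passivepopup shows a transient popup; the trailing number is seconds.
    run(
        ["kdialog", "--title", "Whisper dictation", "--passivepopup", text, "5"],
        check=True,
    )
    return True
```

Calling `deliver("hello world")` on a KDE Wayland session with both tools installed should place the text on the clipboard and show a short popup.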

Installation

Install using pip:

pip install whisper_dictation

Installing into your user directory (not globally) is recommended, because the provided systemd service is written only for an executable at ~/.local/bin.
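To confirm that the executable ended up where the systemd service expects it, a small check like the one below may help. It is only a sketch: `installed_for_user` is a hypothetical helper, and it assumes the default Linux layout in which a per-user install (for example `pip install --user`) places console scripts in ~/.local/bin.

```python
import os
from pathlib import Path


def installed_for_user(home=None, name="whisper_dictation"):
    """Return True if an executable called `name` exists in <home>/.local/bin."""
    home = Path(home) if home is not None else Path.home()
    candidate = home / ".local" / "bin" / name
    # The file must exist and be executable by the current user.
    return candidate.is_file() and os.access(candidate, os.X_OK)


if __name__ == "__main__":
    print("user install found:", installed_for_user())
```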

Usage

To run the project manually, use two terminals. In the first, start the server:

whisper_dictation daemon [--port 9000] [--model_name base]

See whisper_dictation daemon --help for all the available models.

In the second terminal, trigger the daemon with the command below, or assign a shortcut to that command. Run it once to start recording and a second time to stop.

whisper_dictation say [--language en]

You should specify a language code; it can improve performance, especially with a small model.

Alternatively, you can use the systemd service unit provided in this repo to keep the daemon running in the background. Place it in ~/.config/systemd/user/, then enable and start it:

systemctl --user enable whisper_dictation
systemctl --user start whisper_dictation
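After enabling and starting the unit, one way to verify that it is running is to ask systemd directly. The sketch below wraps `systemctl --user is-active --quiet`, which exits with status 0 only when the unit is active. The `service_is_active` helper is hypothetical, and the unit name is assumed to match the one used in the commands above.

```python
import subprocess


def service_is_active(unit="whisper_dictation", run=subprocess.run):
    """Return True if the user unit is reported as active by systemctl."""
    # `is-active --quiet` prints nothing and exits 0 only for an active unit.
    result = run(["systemctl", "--user", "is-active", "--quiet", unit])
    return result.returncode == 0


if __name__ == "__main__":
    print("daemon running:", service_is_active())
```

If this reports False, `systemctl --user status whisper_dictation` shows the unit's state and recent log lines.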

TODO

  • add system integration for a shortcut to start/stop dictation
  • output the dictation where the cursor is (planned as an fcitx addon)
  • optional: a system tray
  • package it for the AUR

Requirements

  1. wl-clipboard
  2. kdialog
