A GUI for OpenAI Whisper based on Tauri and SvelteKit
Whiskey is a graphical user interface (GUI) for OpenAI’s Whisper speech recognition system. It is built with Tauri and SvelteKit and uses C++ binaries for Whisper. Whiskey transcribes audio or video files into written text and highlights the text in real time during playback. Transcriptions can be exported as .txt or .vtt files. This article covers Whiskey’s key features, its installation steps, and a summary of its capabilities.
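To make the .vtt export and playback highlighting more concrete, here is a minimal TypeScript sketch of how timed transcript segments could be serialized as WebVTT and how the segment to highlight could be looked up from the current playback time. It is not taken from Whiskey’s code: the Segment shape and the function names formatTimestamp, toVtt, and findActiveSegment are hypothetical, and the sketch assumes each segment carries start and end times in seconds.

```typescript
// Hypothetical shape of one transcribed segment (an assumption, not Whiskey's API).
// Times are in seconds from the start of the media file.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Format a time in seconds as a WebVTT timestamp: hh:mm:ss.mmm
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Serialize segments as a WebVTT document: the "WEBVTT" header,
// then one cue per segment, with blank lines between blocks.
function toVtt(segments: Segment[]): string {
  const cues = segments.map(
    (seg) =>
      `${formatTimestamp(seg.start)} --> ${formatTimestamp(seg.end)}\n${seg.text.trim()}`
  );
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}

// Return the index of the segment that covers the current playback time,
// or -1 if no segment is active (for example, during silence).
function findActiveSegment(segments: Segment[], currentTime: number): number {
  return segments.findIndex(
    (seg) => currentTime >= seg.start && currentTime < seg.end
  );
}
```

In a setup like this, a player’s time-update handler could call findActiveSegment with the media element’s current time and apply a highlight style to the matching line, while toVtt would produce the text written to the exported .vtt file.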
Planned features include file renaming, drag-and-drop functionality, and support for different platforms.
To install Whiskey, follow the steps below:
Clone the Whiskey repository from GitHub:
git clone https://github.com/whiskey/whiskey.git
Install the required dependencies:
cd whiskey
npm install
Build the application:
npm run build
Run the application:
npm run start
Access Whiskey by navigating to http://localhost:5000 in your web browser.
In summary, Whiskey is a GUI for OpenAI’s Whisper speech recognition system, built with Tauri and SvelteKit. It transcribes audio or video files into written text, highlights the text in real time during playback, and exports transcriptions as .txt or .vtt files. Planned features such as file renaming, drag-and-drop functionality, and support for different platforms aim to improve the user experience and broaden its usability. Installation takes only a few commands, which makes Whiskey a practical option for transcription tasks.