DEV Community

Slobi

Journal Transcriber: Write a journal by dictating it

Hello

Dear readers, today I am going to talk about a journal script I kind of wrote. 😉

The Problem of Saving My Thoughts

The problem I’m trying to solve is that I want to save my thoughts.

I have no problem reading what I wrote, but I don’t enjoy writing. I can dictate, but I don’t want to save or listen to recordings of my voice.

Whenever I encounter such a situation, I get into engineering mode, and if it’s something I can tackle within a few hours of work, I go for it.

Initial Research

First, I looked for a speech-to-text library that’s easy to use, and I found Vosk. It offers a large collection of models. I opted for two small ones because I want to run the app while I code, and they give reasonably decent results.

The Python Solution

Then, with the magic of multiple AI models, I arrived at a solution in Python. It streams my microphone and system sound to the Vosk model and writes the resulting transcription, with timestamps, to a file named with the current date.
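
To make the idea concrete, here is a minimal sketch of such a loop. It is not the actual script from the repository: it assumes the sounddevice package for audio capture, a 16 kHz sample rate, microphone input only (system sound is left out), and hypothetical helper names (journal_path, format_entry, transcribe).

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Optional

SAMPLE_RATE = 16000  # assumption: the chosen small model expects 16 kHz mono audio


def journal_path(directory: Path, now: datetime) -> Path:
    """Return the journal file for the given day, e.g. 2024-05-01.txt."""
    return directory / f"{now:%Y-%m-%d}.txt"


def format_entry(result_json: str, now: datetime) -> Optional[str]:
    """Turn a Vosk result (a JSON string with a "text" field) into a timestamped line."""
    text = json.loads(result_json).get("text", "").strip()
    if not text:
        return None  # silence or noise: nothing worth writing
    return f"[{now:%H:%M:%S}] {text}\n"


def transcribe(model_dir: str, directory: Path) -> None:
    """Stream the default microphone to Vosk and append finished phrases to the journal."""
    # Imported lazily so the helpers above work without the audio libraries installed.
    import sounddevice as sd
    from vosk import KaldiRecognizer, Model

    recognizer = KaldiRecognizer(Model(model_dir), SAMPLE_RATE)
    with sd.RawInputStream(samplerate=SAMPLE_RATE, dtype="int16", channels=1) as stream:
        while True:
            data, _overflowed = stream.read(4000)
            # AcceptWaveform returns True once a complete phrase has been recognized.
            if recognizer.AcceptWaveform(bytes(data)):
                now = datetime.now()
                line = format_entry(recognizer.Result(), now)
                if line:
                    with journal_path(directory, now).open("a", encoding="utf-8") as f:
                        f.write(line)
```

Keeping the file naming and line formatting in small pure functions makes them easy to check without a microphone or a model on disk.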

From Prototype to Daily Usability

It serves the purpose, but it’s not convenient for daily use. One of my mottoes is: if it’s not easy and instant, I won’t use it. So I packed the script into a Python module and wrote a *.desktop file to register it as a regular Linux application (in my case, on Pop!_OS).
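
For readers who have not written a desktop entry before, the sketch below generates one from Python. The module name journal_transcriber, the display name, and the helper functions are assumptions for illustration; the keys themselves (Type, Name, Exec, Terminal, Categories) are standard desktop entry keys, and ~/.local/share/applications is the usual per-user location.

```python
from pathlib import Path

# Assumption: per-user install location; the real project may install elsewhere.
APPLICATIONS_DIR = Path.home() / ".local" / "share" / "applications"


def render_desktop_entry(name: str, module: str) -> str:
    """Build the text of a desktop entry that runs `python3 -m <module>`."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        f"Name={name}",
        f"Exec=python3 -m {module}",
        "Terminal=false",  # launch without opening a terminal window
        "Categories=Utility;",
    ]
    return "\n".join(lines) + "\n"


def install_desktop_entry(name: str, module: str, directory: Path = APPLICATIONS_DIR) -> Path:
    """Write <module>.desktop into the applications directory and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{module}.desktop"
    target.write_text(render_desktop_entry(name, module), encoding="utf-8")
    return target
```

Calling install_desktop_entry("Journal Transcriber", "journal_transcriber") once should make the app appear in the launcher, where it can then be bound to a shortcut.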
As a quick extra touch, I added a keyboard shortcut and, behold the miracle, it works!
Notifications sent with notify-send let you know the app's current state.
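
A small wrapper around notify-send could look like the sketch below. The function names and the message texts are made up for illustration; only the basic notify-send invocation (a title followed by a body) is relied on.

```python
import shutil
import subprocess


def build_notify_command(title: str, body: str) -> list[str]:
    """Return the argument list for a desktop notification."""
    return ["notify-send", title, body]


def notify(title: str, body: str) -> bool:
    """Show a notification; return False if notify-send is not installed."""
    if shutil.which("notify-send") is None:
        return False
    # check=False: a failed notification should never stop the transcription.
    subprocess.run(build_notify_command(title, body), check=False)
    return True
```

For example, notify("Journal Transcriber", "Transcription started") at startup and a matching "Transcription stopped" message on exit would cover both states.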

Eliminating Console Clutter

One thing that irritates me is when an application runs in the console because it clutters my workspace. To avoid this, I needed a simple way to start and stop the app without relying on the terminal. My solution was to implement a lock file system.

When the app starts, it creates a lock file containing its process ID (PID). If the lock file already exists, the script reads the PID from it, sends SIGINT to the running instance (which Python raises as KeyboardInterrupt), and then exits. This way, the first call starts the app and begins transcribing, while the second call stops it.
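
The toggle described above can be sketched as follows. This is an illustration under assumptions, not the code from the repository: the function name toggle, its return values, and the handling of a stale lock (a PID that no longer exists) are all hypothetical.

```python
import os
import signal
from pathlib import Path
from typing import Callable


def toggle(lock: Path, run: Callable[[], None]) -> str:
    """Start `run` if no lock exists; otherwise stop the instance that holds the lock."""
    if lock.exists():
        pid = int(lock.read_text().strip())
        try:
            # SIGINT surfaces as KeyboardInterrupt in the running Python process.
            os.kill(pid, signal.SIGINT)
        except ProcessLookupError:
            pass  # stale lock: the recorded process is already gone
        lock.unlink(missing_ok=True)
        return "stopped"

    lock.write_text(str(os.getpid()))
    try:
        run()  # blocks until the work finishes or SIGINT arrives
    except KeyboardInterrupt:
        pass  # expected way to stop: a second call sent SIGINT
    finally:
        lock.unlink(missing_ok=True)
    return "ran"
```

Because the lock is removed in a finally block, the next call starts a fresh instance even if the previous one was interrupted mid-transcription.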

Solving Problems the Inventive Way

I hope this article sparked someone’s wish to solve their own problem in a unique, inventive, and somewhat polished way.

Feel free to check my other similar article:
Automating Text Extraction from Screenshots

Also, feel free to check out the code on GitHub.

Have a great day 🚀
