Hello guys,
In this tutorial, I will guide you through building a Python program that converts an image to sound.
Throughout the tutorial, we will learn the concepts of optical character recognition (OCR) and speech synthesis, and then combine them into a single working program.
Project Requirements
Installation
$ pip install Pillow
$ pip install gTTS
$ pip install pytesseract
Also, for pytesseract to work, you have to install Google's Tesseract-OCR Engine on your machine.
To install the Tesseract engine, CLICK HERE for full installation instructions for your operating system.
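Before moving on, it can help to confirm that the engine is actually reachable from Python. The snippet below is a minimal sketch that only uses the standard library; the helper name `find_tesseract` is made up for this example, and it assumes the executable is called `tesseract` on your system.

```python
import shutil


def find_tesseract(command="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(command)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract was not found on PATH; pytesseract will not be able to run it.")
    else:
        print(f"Tesseract found at: {path}")
```

If the engine is installed in a folder that is not on PATH (this often happens on Windows), pytesseract lets you point to it by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.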
Now that everything is installed, let's start building our program.
Project Folder
In your project folder, you should have a sample image containing text, which we will use to test our program.
.
├── app.py
└── image.jpg
0 directories, 2 files
Our project is divided into two main parts:
- Converting the image to text (optical character recognition)
- Converting the generated text to speech (speech synthesis)
Converting Image to Text
At this stage, we use the Python library pytesseract to perform optical character recognition, which can be done in only one line of code.
But before we perform optical character recognition on our image, we need a way to load the image in the required format.
For this, we are going to use the Pillow library. Let's see how to convert an image to text using pytesseract and Pillow, as shown in the example below.
Example of Usage
>>> from PIL import Image
>>> from pytesseract import image_to_string
>>> text = image_to_string(Image.open('image.jpg'))
>>> text
'JOBS FILL\nYflUR POCKET.\nADVENTURES\nFILL YOUR\nLIFE.'
That's how you can easily perform OCR in just one line of code. Now let's see how we can convert the text to speech using gTTS.
Converting Generated Text to speech
There are various ways to convert text to speech in Python. In case you want to review them all, you can CLICK HERE.
In this tutorial, we are going to use Google Text-to-Speech (gTTS) to convert our decoded text into sound.
gTTS
The syntax for performing text to speech is very simple; you can also do it in just one line of code, as shown in the example below.
>>> from gtts import gTTS
>>> gTTS('Coding is awesome trust me').save('sound.mp3')
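One thing to watch for when the text comes from OCR: an image without readable text can produce an empty string, which leaves gTTS with nothing to speak. Here is a minimal sketch of a small wrapper that skips the conversion in that case. The function `text_to_mp3` and its `tts_factory` parameter are made up for this example (they are not part of gTTS); `tts_factory` only exists so the function can be tried without network access.

```python
def text_to_mp3(text, out_path="sound.mp3", lang="en", tts_factory=None):
    """Save `text` as speech in `out_path`; return False if there is nothing to say.

    `tts_factory` is a hypothetical hook for this sketch (not part of gTTS):
    any callable that accepts (text, lang=...) and returns an object with .save(path).
    """
    text = text.strip()
    if not text:
        # OCR may return an empty string for images without readable text
        return False
    if tts_factory is None:
        # Imported lazily; gTTS needs network access when saving
        from gtts import gTTS
        tts_factory = gTTS
    tts_factory(text, lang=lang).save(out_path)
    return True
```

Calling `text_to_mp3('Coding is awesome trust me')` should behave like the one-liner above, while `text_to_mp3('')` simply returns False instead of attempting the conversion.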
Final program
I made the simple program below using what we just learned, with the addition of a cleaner function that removes \n from the generated text so that it converts to sound more easily.
from PIL import Image
from gtts import gTTS
from pytesseract import image_to_string
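# Join the OCR lines into a single line of text
clean = lambda text: ' '.join(text.split('\n'))
# Load the image and run OCR on it
to_text = lambda image: clean(image_to_string(Image.open(image)))
# Synthesize speech and save it as gene.mp3
to_sound = lambda text: gTTS(text, lang='en').save('gene.mp3')
image_to_sound = lambda image: to_sound(to_text(image))
image_to_sound('image.jpg')
# Wait for Enter before exiting
input()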
When you run the above code, it opens our sample image, performs optical character recognition, cleans the generated text by removing \n, and converts it into sound using gTTS, saving the result as gene.mp3.
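The cleaner in the program only replaces \n with a space, so blank lines in the OCR output end up as double spaces. As far as I know, some Tesseract versions also end their output with a form-feed character. Below is a minimal sketch of a slightly stricter cleaner; the name `normalize_ocr_text` is made up for this example and is not part of the original program.

```python
def normalize_ocr_text(text):
    """Collapse every run of whitespace (newlines, tabs, form feeds) into one space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces
    return ' '.join(text.split())


sample = 'JOBS FILL\nYflUR POCKET.\nADVENTURES\nFILL YOUR\nLIFE.'
print(normalize_ocr_text(sample))
# JOBS FILL YflUR POCKET. ADVENTURES FILL YOUR LIFE.
```

You could swap it in for `clean` in the program above if your images produce blank lines or stray whitespace.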
In case you find it interesting, don't be shy: share it with your friends on Twitter and other social media.
The Original Article can be found on kalebujordan.dev
Kalebu / image-to-sound-python-
A Python project for converting an image into audible sound using OCR and speech synthesis
image-to-sound-python-
Intro
This repo will help you get started with optical character recognition (OCR) and speech synthesis in Python by building a simple project that converts an image into audible sound, combining both OCR and speech synthesis in one application.
Full article
The full article for this source code can be found on my blog, in an article named How to convert image to sound in Python.
Getting started
To use this code, first clone the repo using git or download the zip file manually.
$-> git clone https://github.com/Kalebu/image-to-sound-python-
$->cd image-to-sound-python-
$ image-to-sound-python--> python app.py
Dependencies
To run this code, you need to have the pytesseract and Google text-to-speech (gTTS) libraries installed on your machine; you can use the pip command to do this.
-> pip install pytesseract
->
…