Sometimes you need the text that appears inside an image, but you can't select or copy it, and typing it out manually is tedious.
In this article, I will show you how to build an app using Reflex that will be able to extract text from images.
The following will be the output of the app:
Outline
- Create a new folder, open it with a code editor
- Create a virtual environment and activate
- Install requirements
- reflex setup
- reflex_ocr_system.py
- state.py
- style.py
- .gitignore
- Run app
- Conclusion
Create a new folder, open it with a code editor
Create a new folder, name it reflex_ocr_system, and open it with a code editor like VS Code.
Create a virtual environment and activate
Open the terminal. Use the following commands to create a virtual environment named .venv and activate it:
python3 -m venv .venv
source .venv/bin/activate
Install requirements
We will need to install reflex to build the app, along with tesseract-ocr, pytesseract, and Pillow to process the image and extract the text from it.
Run the following commands in the terminal (the apt-get command is for Debian/Ubuntu):
sudo apt-get install tesseract-ocr
pip install reflex==0.2.9
pip install pytesseract==0.3.10
pip install Pillow==10.1.0
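Optionally, you can also pin the Python dependencies in a requirements.txt file (this file is a convenience, not required by the steps above), so others can install them in one go with pip install -r requirements.txt:

```text
reflex==0.2.9
pytesseract==0.3.10
Pillow==10.1.0
```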
reflex setup
Now, we need to create the project using Reflex. Run the following command to initialize the template app in the reflex_ocr_system directory:
reflex init
The above command will create the template file structure in the reflex_ocr_system directory.
You can run the app with the following command; you should then see a welcome page when you go to http://localhost:3000/ in your browser:
reflex run
reflex_ocr_system.py
We need to build the structure and interface of the app. Go to the reflex_ocr_system subdirectory and open the reflex_ocr_system.py file; this is where we will add the components. Add the following code to it:
import reflex as rx

# import State and style
from reflex_ocr_system.state import State
from reflex_ocr_system import style

# color for the upload component
color = "rgb(107,99,246)"


def index():
    """The main view."""
    return rx.vstack(
        rx.heading("OCR System - Extract text from Images", style=style.topic_style),
        rx.upload(
            rx.vstack(
                rx.button(
                    "Select File",
                    color=color,
                    bg="white",
                    border=f"1px solid {color}",
                ),
                rx.text(
                    "Drag and drop files here or click to select files",
                    color="white",
                ),
            ),
            multiple=False,
            accept={
                "image/png": [".png"],
                "image/jpeg": [".jpg", ".jpeg"],
                "image/gif": [".gif"],
                "image/webp": [".webp"],
            },
            max_files=1,
            disabled=False,
            on_keyboard=True,
            border=f"1px dotted {color}",
            padding="5em",
        ),
        rx.hstack(rx.foreach(rx.selected_files, rx.text), color="white"),
        rx.button(
            "Click to Upload and Extract the text from selected Image",
            on_click=lambda: State.handle_upload(rx.upload_files()),
            is_loading=State.is_loading,
            loading_text=State.loading_text,
            spinner_placement="start",
        ),
        rx.text(
            State.extracted_text_heading,
            text_align="center",
            font_weight="bold",
            color="white",
        ),
        rx.text(State.extracted_text, text_align="center", style=style.extracted_text_style),
    )


# Add state and page to the app.
app = rx.App(style=style.style)
app.add_page(index)
app.compile()
The above code will render a heading, a file-upload component, the name of the selected file, a button, and two text components for the extracted-text heading and the extracted text itself.
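The accept prop of rx.upload maps MIME types to the file extensions the file picker will allow, and the browser enforces this restriction for us. As a rough sketch of what that mapping means (the is_accepted helper below is purely illustrative and not part of Reflex), you can model the check with Python's standard mimetypes module:

```python
import mimetypes

# The same MIME-type -> extensions mapping passed to rx.upload's `accept` prop.
ACCEPT = {
    "image/png": [".png"],
    "image/jpeg": [".jpg", ".jpeg"],
    "image/gif": [".gif"],
    "image/webp": [".webp"],
}


def is_accepted(filename: str) -> bool:
    """Hypothetical helper: check a filename against the accept mapping."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        return False
    extension = "." + filename.rsplit(".", 1)[-1].lower()
    return extension in ACCEPT.get(mime, [])


print(is_accepted("scan.png"))   # True
print(is_accepted("notes.pdf"))  # False
```

Restricting uploads to image types up front means pytesseract never receives a file Pillow cannot open.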
state.py
Create a new file state.py in the reflex_ocr_system subdirectory and add the following code:
import reflex as rx
import pytesseract
from PIL import Image


class State(rx.State):
    """The app state."""

    extracted_text_heading: str
    extracted_text: str
    is_loading: bool = False
    loading_text: str = ""

    async def handle_upload(self, files: list[rx.UploadFile]):
        """Handle the upload of files and extraction of text.

        Args:
            files: The uploaded files.
        """
        # set the following values to spin the button and
        # show the loading text
        self.is_loading = True
        self.loading_text = "uploading and extracting text...."
        yield

        for file in files:
            upload_data = await file.read()
            outfile = rx.get_asset_path(file.filename)

            # Save the file.
            with open(outfile, "wb") as file_object:
                file_object.write(upload_data)

            # Open the image using Pillow (PIL)
            image = Image.open(outfile)

            # Use Tesseract to extract text from the image
            text = pytesseract.image_to_string(image)
            text = text.encode("ascii", "ignore")
            self.extracted_text = text.decode()
            self.extracted_text_heading = "Extracted Text👇"

        # reset the state variables again
        self.is_loading = False
        self.loading_text = ""
        yield
The above code will get the uploaded file, save it, and use Tesseract to extract the text from the image. The is_loading variable controls the button's spinner, and the loading_text variable holds the text shown while the button is spinning.
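In handle_upload, the line text.encode("ascii", "ignore") drops any non-ASCII characters Tesseract produces (stray symbols from noisy scans, for example). Pulled out into a small standalone helper (the name to_ascii is just for illustration), it behaves like this:

```python
def to_ascii(text: str) -> str:
    """Drop non-ASCII characters, mirroring the encode/decode step in handle_upload."""
    return text.encode("ascii", "ignore").decode()


print(to_ascii("résumé £9"))   # rsum 9
print(to_ascii("plain text"))  # plain text
```

Note that this also strips legitimate accented characters, so if you expect non-English text in your images you may prefer to keep the raw Tesseract output instead.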
style.py
Create a new file style.py in the reflex_ocr_system subdirectory and add the following code. This will add styling to the page and components:
style = {
    "background-color": "#454545",
    "font_family": "Comic Sans MS",
    "font_size": "16px",
}

topic_style = {
    "color": "white",
    "font_family": "Comic Sans MS",
    "font_size": "3em",
    "font_weight": "bold",
    "box_shadow": "rgba(190, 236, 0, 0.4) 5px 5px, rgba(190, 236, 0, 0.3) 10px 10px",
    "margin-bottom": "3rem",
}

extracted_text_style = {
    "color": "white",
    "text-align": "center",
    "font_size": "0.9rem",
    "width": "80%",
    "display": "inline-block",
}
.gitignore
You can add the .venv directory to the .gitignore file to get the following:
*.db
*.py[cod]
.web
__pycache__/
.venv/
Run app
Run the following in the terminal to start the app:
reflex run
You should see the app's interface when you go to http://localhost:3000/ in your browser.
You can upload an image and then click the button to extract the text from the image.
Conclusion
Note that the accuracy of OCR can vary based on the image quality, fonts, and languages used in the image.
You can get the code: https://github.com/emmakodes/reflex_ocr_system.git
To learn more about Reflex, you can read here: https://reflex.dev/
To install tesseract-ocr on other platforms, you can check this solution: solutions to install tesseract-ocr