Hi everyone, I would like to introduce you to my own voice assistant, Selena. Before you start, you can find the code here. The assistant is designed to speak Turkish, but you can adapt it to your own language by changing the commands and responses.
Main modules used:
Speech Recognition: I think this is one of the most important modules, because we are building a voice assistant and the assistant has to understand what we say. This module handles that by recognizing our speech.
gTTS: This module lets the assistant talk to us. There are many alternatives, but I prefer this one because it is one of the few that supports Turkish (see the speak() sketch after this list).
WolframAlpha: Used to compute expert-level answers using Wolfram's algorithms, knowledge base, and AI technology. You can find detailed information in its own documentation.
Wikipedia: As we all know, Wikipedia is a great source of information. We use the wikipedia module to fetch information from Wikipedia or to search it.
Webbrowser: This module, which comes built into Python, is used to open web searches in the browser.
Requests: Used to make HTTP GET and POST requests.
Googletrans: This module lets us use Google's translation service.
Playsound: This module lets us play various audio files.
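To give a concrete idea of how gTTS and playsound work together, here is a minimal sketch of a speak() helper like the one the commands call. The actual implementation on GitHub may differ; the temporary file name here is just an illustration.

import os
from gtts import gTTS
from playsound import playsound

def speak(text):
    # Convert the given text to Turkish speech with gTTS,
    # save it to a temporary mp3 file and play it back.
    tts = gTTS(text=text, lang='tr')
    tts.save("speech.mp3")
    playsound("speech.mp3")
    os.remove("speech.mp3")  # clean up the temporary file

speak("Merhaba, ben Selena")

Here are the imports used in the main script: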
import json
import webbrowser
from gtts import gTTS
from playsound import playsound
import speech_recognition as sr
from random import choice, randrange
import os
import sys
import datetime
import time
from googletrans import Translator
import wikipedia
import wolframalpha as wa
import osascript
import speedtest
Let's take a look at the general flow. First, we check the internet connection; the program keeps running as long as a connection is available. Second, we decide when Selena should take commands from us, using the keyword "Selena". The program listens all the time, but it only starts listening for commands once it hears the word "Selena".
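The check_internet() implementation isn't shown in this post. A minimal sketch of what it could look like, assuming a simple HTTP probe with requests (the URL and timeout are only illustrative):

import requests

def check_internet():
    # Return True if we can reach the web, False otherwise.
    try:
        requests.get("https://www.google.com", timeout=5)
        return True
    except requests.RequestException:
        return False

With that in place, the main loop looks like this: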
r = sr.Recognizer()
isSelena = False

while check_internet():
    with sr.Microphone() as source:
        # Calibrate against background noise before listening
        r.adjust_for_ambient_noise(source)
        print("Arka plan gürültüsü: " + str(r.energy_threshold))  # "Background noise"
        audio = r.listen(source)
        data = ""
        try:
            if not isSelena:
                # Recognize the Turkish speech with Google's speech recognition
                data = r.recognize_google(audio, language='tr')
                print(data)
                sound = data.upper()
                soundBlocks = sound.split()
                # Start listening for a command only when the wake word is heard
                if "SELENA" in soundBlocks:
                    listen()
                    time.sleep(1)
        except sr.UnknownValueError:
            print("Could not understand the audio")
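The listen() function called above isn't shown here either. Roughly, it records the next utterance and hands it to the command handler; a minimal sketch of that idea, assuming the Commands class takes the recognized text in its constructor (the details on GitHub may differ):

def listen():
    # Record the next utterance and pass it to the command handler.
    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    try:
        data = r.recognize_google(audio, language='tr')
        command = Commands(data)  # hypothetical constructor taking the recognized text
        command.findCommand()
    except sr.UnknownValueError:
        print("Could not understand the audio")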
Commands are processed in a separate file named Commands.py. After detecting what the user said, the findCommand() method is executed to determine which command to perform.
def findCommand(self):
    i = 0
    for command in self.commands:
        if command in self.soundBlocks:
            # A known keyword matched; run the corresponding command
            self.commandRun(command)
            break
        else:
            i = i + 1
    if len(self.commands) == i:
        # No keyword matched; fall back to the Wolfram|Alpha API
        answer = self.wolframSearch(self.sound)
        if answer != "404":
            self.speak(answer)
        else:
            # "ANLAMADIM" = "I did not understand"
            self.commandRun("ANLAMADIM")
If a keyword matches, the commandRun() method executes the relevant command. If no keyword matches, the Wolfram API is asked for help, and if a reasonable response comes back it is spoken to the user.
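The wolframSearch() method itself isn't shown in this post. Here is a minimal sketch of how it could work with the wolframalpha and googletrans modules, assuming the Turkish query is translated to English for Wolfram|Alpha, the answer is translated back to Turkish, and "404" is returned when there is no result; the app ID placeholder and helper names are just illustrative.

import wolframalpha as wa
from googletrans import Translator

client = wa.Client("YOUR_APP_ID")  # placeholder Wolfram|Alpha app ID
translator = Translator()

def wolframSearch(sound):
    try:
        # Translate the Turkish query to English for Wolfram|Alpha
        query = translator.translate(sound, src='tr', dest='en').text
        res = client.query(query)
        answer = next(res.results).text
        # Translate the English answer back to Turkish before speaking it
        return translator.translate(answer, src='en', dest='tr').text
    except StopIteration:
        # No usable result; the caller treats "404" as "no answer"
        return "404"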
You can review the rest of the code yourself on GitHub. In the future, I will publish articles describing the project in more detail. It was fun to build with these various modules. The project is not finished yet, so it will keep evolving. I look forward to your comments and suggestions. In particular, I wonder what improvements could make it run more efficiently.