This is a concept I have for a faster and simpler voice model. It uses TensorFlow and Java, and I am hoping to implement it in ara, an app I am working on.
In my basic version of the ara voice recognition there is this code:
switch (labelIndex - 2) {
    case 0:
        resulttxt = "yes";
        break;
    case 1:
        resulttxt = "no";
        break;
    case 2:
        resulttxt = "up";
        break;
    case 3:
        resulttxt = "down";
        break;
    case 4:
        resulttxt = "left";
        break;
    case 5:
        resulttxt = "right";
        break;
    case 6:
        resulttxt = "on";
        break;
    case 7:
        resulttxt = "off";
        break;
    case 8:
        resulttxt = "stop";
        break;
    case 9:
        resulttxt = "go";
        break;
}
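Since that switch is just a fixed index-to-word mapping, it could probably also be written as an array lookup. Here is a minimal sketch of that, assuming labelIndex is the index of the top-scoring model output and that the `- 2` skips two leading labels that are not command words. The class name CommandLabels and the method wordFor are made up for illustration; they are not from the ara code.

```java
final class CommandLabels {
    // Same ten words, in the same order as switch cases 0..9.
    private static final String[] WORDS = {
        "yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"
    };

    // Assumption: the first two model outputs are not command words,
    // which is why the original code switches on (labelIndex - 2).
    private static final int OFFSET = 2;

    // Returns the word for a label index, or an empty Optional when the
    // index falls outside the ten words (the switch did nothing in that case).
    static java.util.Optional<String> wordFor(int labelIndex) {
        int i = labelIndex - OFFSET;
        if (i < 0 || i >= WORDS.length) {
            return java.util.Optional.empty();
        }
        return java.util.Optional.of(WORDS[i]);
    }

    public static void main(String[] args) {
        System.out.println(wordFor(2).orElse("(no word)"));
    }
}
```

Adding or reordering words then only means editing the array, and an out-of-range index is handled explicitly instead of silently falling through.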
My idea takes this basic concept, but instead of full words it looks for sounds and then words. This could be the new recognition code:
switch (labelIndex - 2) {
    case 0:
        resulttxt = resulttxt + "LongA";
        break;
    case 1:
        resulttxt = resulttxt + "ShortA";
        break;
    case 2:
        resulttxt = resulttxt + "ShortB";
        break;
    case 3:
        resulttxt = resulttxt + "LongB"; // Such as the words be and bee.
        break;
    case 4:
        resulttxt = resulttxt + "ShortC";
        break;
    // .......................................................
}
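To make the "sounds, then words" step more concrete, here is a rough sketch of how the recognized sounds could be collected and then looked up as a word. It assumes the model gives one label index per slice of audio and reuses the same `- 2` offset as above. Only the five sound labels shown in the post are included. The class SoundDecoder, its methods, and the "cab" dictionary entry are all hypothetical, just to show the shape of the idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class SoundDecoder {
    // The five sound labels from the switch above, in case order 0..4.
    private static final String[] SOUNDS = {
        "LongA", "ShortA", "ShortB", "LongB", "ShortC"
    };

    // Assumption: same two non-sound labels at the start as in the word model.
    private static final int OFFSET = 2;

    // Hypothetical dictionary: space-separated sound sequence -> word.
    private static final Map<String, String> WORDS = new HashMap<>();
    static {
        WORDS.put("LongB", "be");                // per the comment in the post
        WORDS.put("ShortC ShortA ShortB", "cab"); // made-up entry for illustration
    }

    // Step 1: turn each label index into a sound name, skipping unknown indices.
    static List<String> toSounds(int[] labelIndices) {
        List<String> sounds = new ArrayList<>();
        for (int labelIndex : labelIndices) {
            int i = labelIndex - OFFSET;
            if (i >= 0 && i < SOUNDS.length) {
                sounds.add(SOUNDS[i]);
            }
        }
        return sounds;
    }

    // Step 2: look the whole sound sequence up in the dictionary.
    static Optional<String> toWord(List<String> sounds) {
        return Optional.ofNullable(WORDS.get(String.join(" ", sounds)));
    }

    public static void main(String[] args) {
        List<String> sounds = toSounds(new int[] {6, 3, 4});
        System.out.println(sounds + " -> " + toWord(sounds).orElse("(unknown)"));
    }
}
```

Keeping the sounds in a list with a separator, rather than gluing them into one string, avoids ambiguity about where one sound name ends and the next begins. A real recognizer would probably also have to merge repeated labels from neighbouring audio slices and decide where one word stops and the next starts; this sketch does neither.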
Please give your input. I am no expert on AI, so if it's dumb, tell me.
My goal with this is a small tflite file that can: