Creating a Speech Recognition App in Angular

Published in codeburst

4 min read, Aug 11, 2020

We can extract text from speech by using speech recognition. There are many ways to do speech recognition in Angular, but I’d like to focus on one simple method.

Here we use the Web Speech API to recognize speech. Unfortunately, this API is supported in only a few browsers, which I list below:

Google Chrome
Chrome for Android
Samsung Internet
QQ Browser
Baidu Browser

You can test this app in a browser from this list.
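
Because support is limited, it can help to check for the API before using it. The following is a minimal sketch of such a check, not part of the original app. `SpeechGlobals` and `isSpeechRecognitionSupported` are hypothetical names introduced here; only the two constructor names are ones browsers actually expose.

```typescript
// Minimal shape of the global object we care about. Both property names are
// the real constructor names (unprefixed and webkit-prefixed).
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Hypothetical helper: true when either constructor is present.
function isSpeechRecognitionSupported(globals: SpeechGlobals): boolean {
  return (
    typeof globals.SpeechRecognition === 'function' ||
    typeof globals.webkitSpeechRecognition === 'function'
  );
}
```

In a browser you would pass the global object, for example `isSpeechRecognitionSupported(window as SpeechGlobals)`, and show a fallback message when it returns false.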

Okay, let’s begin. First, create a new Angular project by running the command below in the terminal. This assumes that you have the Angular CLI installed; without it, the command won’t work.

ng new voice-recognition
cd ./voice-recognition

Next, we create a new service for speech recognition. This service can be reused in multiple components.

ng g s service/voice-recognition

Create Service

Here we create the service files in a separate directory called “service” as a best practice. The service is where we call the webkitSpeechRecognition API.

Code review

Let’s review this code. First, we declare “webkitSpeechRecognition” as shown in the line below. Otherwise the TypeScript compiler will not recognize it, because “webkitSpeechRecognition” is a browser global rather than an imported library.

declare var webkitSpeechRecognition: any;

Next, we create a new instance of webkitSpeechRecognition and initialize it by setting its properties.

recognition = new webkitSpeechRecognition();
...
this.recognition.interimResults = true;
this.recognition.lang = 'en-US';

We enable interim results and set the language as needed. The supported languages you can set are listed below:

Afrikaans af
Basque eu
Bulgarian bg
Catalan ca
Arabic (Egypt) ar-EG
Arabic (Jordan) ar-JO
Arabic (Kuwait) ar-KW
Arabic (Lebanon) ar-LB
Arabic (Qatar) ar-QA
Arabic (UAE) ar-AE
Arabic (Morocco) ar-MA
Arabic (Iraq) ar-IQ
Arabic (Algeria) ar-DZ
Arabic (Bahrain) ar-BH
Arabic (Libya) ar-LY
Arabic (Oman) ar-OM
Arabic (Saudi Arabia) ar-SA
Arabic (Tunisia) ar-TN
Arabic (Yemen) ar-YE
Czech cs
Dutch nl-NL
English (Australia) en-AU
English (Canada) en-CA
English (India) en-IN
English (New Zealand) en-NZ
English (South Africa) en-ZA
English (UK) en-GB
English (US) en-US
Finnish fi
French fr-FR
Galician gl
German de-DE
Hebrew he
Hungarian hu
Icelandic is
Italian it-IT
Indonesian id
Japanese ja
Korean ko
Latin la
Mandarin Chinese zh-CN
Traditional Taiwan zh-TW
Simplified China zh-CN ?
Simplified Hong Kong zh-HK
Yue Chinese (Traditional Hong Kong) zh-yue
Malaysian ms-MY
Norwegian no-NO
Polish pl
Pig Latin xx-piglatin
Portuguese pt-PT
Portuguese (Brazil) pt-BR
Romanian ro-RO
Russian ru
Serbian sr-SP
Slovak sk
Spanish (Argentina) es-AR
Spanish (Bolivia) es-BO
Spanish (Chile) es-CL
Spanish (Colombia) es-CO
Spanish (Costa Rica) es-CR
Spanish (Dominican Republic) es-DO
Spanish (Ecuador) es-EC
Spanish (El Salvador) es-SV
Spanish (Guatemala) es-GT
Spanish (Honduras) es-HN
Spanish (Mexico) es-MX
Spanish (Nicaragua) es-NI
Spanish (Panama) es-PA
Spanish (Paraguay) es-PY
Spanish (Peru) es-PE
Spanish (Puerto Rico) es-PR
Spanish (Spain) es-ES
Spanish (US) es-US
Spanish (Uruguay) es-UY
Spanish (Venezuela) es-VE
Swedish sv-SE
Turkish tr
Zulu zu
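
If the language should follow the user's preference rather than being hard-coded, one option is to pick a tag from the list above with a fallback. This is only a sketch under assumptions: `SUPPORTED_TAGS` holds a small sample of the tags listed above, and `chooseRecognitionLang` is a hypothetical helper, not part of the Web Speech API.

```typescript
// A few tags copied from the list above; extend as needed.
const SUPPORTED_TAGS: readonly string[] = ['en-US', 'en-GB', 'fr-FR', 'de-DE', 'ja'];

// Hypothetical helper: returns `preferred` if it is listed, otherwise the first
// listed tag with the same primary language subtag, otherwise `fallback`.
function chooseRecognitionLang(
  preferred: string,
  supported: readonly string[] = SUPPORTED_TAGS,
  fallback: string = 'en-US'
): string {
  if (supported.includes(preferred)) {
    return preferred;
  }
  const primary = preferred.split('-')[0].toLowerCase();
  const sameLanguage = supported.find(
    (tag) => tag.split('-')[0].toLowerCase() === primary
  );
  return sameLanguage ?? fallback;
}
```

In the service this could be used as `this.recognition.lang = chooseRecognitionLang(navigator.language);`, where `navigator.language` is the browser's preferred language tag.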

Then we add an event listener for the “result” event to collect the recognized words, and assign them to the “tempWords” variable for later use.

this.recognition.addEventListener('result', (e) => {
  const transcript = Array.from(e.results)
    .map((result) => result[0])
    .map((result) => result.transcript)
    .join('');
  this.tempWords = transcript;
});
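
To make the mapping in that listener easier to follow, here is the same idea as a standalone function that can be tried without a microphone. The `Alternative` and `ResultLike` types are simplified stand-ins I am assuming for the browser's result objects, and `joinTranscripts` is a hypothetical name.

```typescript
// Simplified stand-ins for the browser's result objects: each result is an
// array-like of alternatives, and each alternative carries a transcript.
interface Alternative {
  transcript: string;
}
type ResultLike = ArrayLike<Alternative>;

// Takes the first (best) alternative of every result and joins the transcripts.
function joinTranscripts(results: ArrayLike<ResultLike>): string {
  return Array.from(results)
    .map((result) => result[0])
    .map((alternative) => alternative.transcript)
    .join('');
}
```

Inside the listener, `joinTranscripts(e.results)` would produce the same string that is assigned to “tempWords”.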

The next part starts voice recognition and keeps it running continuously.

start() {
  this.isStoppedSpeechRecog = false;
  this.recognition.start();
  console.log("Speech recognition started");
  this.recognition.addEventListener('end', (condition) => {
    if (this.isStoppedSpeechRecog) {
      this.recognition.stop();
      console.log("End speech recognition");
    } else {
      this.wordConcat();
      this.recognition.start();
    }
  });
}

We add an event listener for the “end” event. Inside it we start voice recognition again, so that the service keeps running until it is asked to stop. The “isStoppedSpeechRecog” variable is the flag that decides whether the service should stop or continue. Without this restart, recognition stops automatically when the user falls silent.

To stop voice recognition, use the code snippet below:
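
One thing to watch with this pattern: a listener added inside start() is added again on every call, so the handlers can pile up if the user starts and stops several times. The sketch below is a variation that registers the “end” handler once, in the constructor. It is not the article's service; `RecognizerLike` and `ContinuousRecognition` are hypothetical names, and the interface covers only the three members used here.

```typescript
// The subset of the recognition object this sketch relies on.
interface RecognizerLike {
  start(): void;
  stop(): void;
  addEventListener(type: string, listener: () => void): void;
}

class ContinuousRecognition {
  private stopped = true;
  private readonly recognition: RecognizerLike;
  private readonly onSegmentEnd: () => void;

  constructor(recognition: RecognizerLike, onSegmentEnd: () => void = () => {}) {
    this.recognition = recognition;
    this.onSegmentEnd = onSegmentEnd;
    // Registered once, so repeated start() calls do not stack handlers.
    this.recognition.addEventListener('end', () => {
      if (this.stopped) {
        return; // the user asked to stop, so let the session end
      }
      this.onSegmentEnd(); // e.g. flush the interim words
      this.recognition.start(); // silence ended the session; begin a new one
    });
  }

  start(): void {
    this.stopped = false;
    this.recognition.start();
  }

  stop(): void {
    this.stopped = true;
    this.recognition.stop();
  }
}
```

In the service, the callback passed as `onSegmentEnd` would play the role of wordConcat().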

stop() {
  this.isStoppedSpeechRecog = true;
  this.wordConcat();
  this.recognition.stop();
  console.log("End speech recognition");
}

I wrote a method called “wordConcat()” which concatenates all recognized speech into one paragraph:

wordConcat() {
  this.text = this.text + ' ' + this.tempWords + '.';
  this.tempWords = '';
}
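
Note that this concatenation appends a space and a period even when nothing was recognized, which can happen when recognition restarts after silence. As a hedged variation, the same step can be written as a pure function that skips empty input; `appendSentence` is a hypothetical name, not a method of the service.

```typescript
// Appends the recognized words to the running text as one sentence.
// Returns the text unchanged when nothing was recognized.
function appendSentence(text: string, words: string): string {
  const trimmed = words.trim();
  if (trimmed === '') {
    return text;
  }
  return text === '' ? `${trimmed}.` : `${text} ${trimmed}.`;
}
```

With this helper, the body of wordConcat() could become `this.text = appendSentence(this.text, this.tempWords);` followed by clearing “tempWords”.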

Create the component

Now that we have reviewed the service file, the next step is to call this service from a component. You can create a new component with the command below:

ng g c speech-to-text

This creates a new TypeScript file and an HTML file for the component. You can enter the below code to call the service:

This is the HTML file.

Remember to include the tag below in app.component.html:

<app-speech-to-text></app-speech-to-text>

You can use the below command to run the app:

ng serve -o

Conclusion

Finally, you have a voice recognition Angular app! Congratulations. This link redirects to the tested app I have coded. Here are a few applications of this method:

Analyzing user readability.
Real-time captioning/subtitles in voice conferences.
Voice typing.

Try applying this to build something valuable. I look forward to seeing you in another tutorial. Thank you for reading. Happy coding!

Reference

developers.google.com

Written by Donishka Tharindu
