Introduction
The Web Speech API is a useful browser API for converting speech to text. To use it, the user needs a microphone and must grant the browser permission to access it.
Browser Support
js
if ('webkitSpeechRecognition' in window) {
  console.log('Speech Recognition is available');
} else {
  console.log('Speech Recognition Not Available');
}
Checking for browser support is required, because not every browser implements this API.
SpeechRecognition()
js
let recognition = new webkitSpeechRecognition();
The unprefixed SpeechRecognition() constructor is not exposed by every browser (Chrome, for example, has long shipped only the prefixed name), so create the instance with webkitSpeechRecognition().
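As a minimal sketch of the check and the constructor choice combined, the helper below prefers the unprefixed name and falls back to the prefixed one. The function name `getSpeechRecognition` and its `scope` parameter are assumptions made for illustration; they are not part of the API.

```javascript
// Hypothetical helper: return the speech recognition constructor,
// or null when the browser does not support the API.
// `scope` is injectable for illustration; in a page it is `window`.
function getSpeechRecognition(scope = globalThis) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

const Recognition = getSpeechRecognition();
if (Recognition) {
  console.log('Speech Recognition is available');
} else {
  console.log('Speech Recognition Not Available');
}
```

When the helper returns a constructor, `new Recognition()` can be used in place of `new webkitSpeechRecognition()`.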
Methods
js
recognition.start();
recognition.stop();
Name | Description |
---|---|
start() | Start listening for speech |
stop() | Stop listening and return a result from the audio captured so far |
abort() | Stop listening without returning a result |

The stop() and abort() methods differ in what happens to speech that has already been captured. The stop() method still delivers a result for it, whereas the abort() method ends the session without delivering any result.
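The difference can be sketched with a small wrapper that picks one method or the other. The name `endSession` and the `keepResult` flag are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: end a recognition session.
// keepResult = true  -> stop():  speech captured so far is still recognized
// keepResult = false -> abort(): the session ends without a result
function endSession(recognition, keepResult) {
  if (keepResult) {
    recognition.stop();
  } else {
    recognition.abort();
  }
}

// Possible usage in a page (the button ids are assumptions):
// document.querySelector('#done-btn').onclick = () => endSession(recognition, true);
// document.querySelector('#cancel-btn').onclick = () => endSession(recognition, false);
```

A "Done" button would typically use stop(), while a "Cancel" button would use abort().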
Event Listeners
js
recognition.onresult = function(event) {
  // results handling
};
// When speech has ended
recognition.onspeechend = function() {
  recognition.stop();
};
Events
Name | Description |
---|---|
onresult | Fired when the service returns a result. How often it fires depends on the configuration options. |
onspeechstart | When the speech that will be used for speech recognition has started |
onspeechend | When the speech that will be used for speech recognition has ended. |
onstart | When the recognition service has begun to listen to the audio with the intention of recognizing. |
onnomatch | When the service returns a final result with no significant recognition, for example when a grammar list is used and no words match. |
onend | When the service has disconnected. The event must always be generated when the session ends, no matter the reason for the end. |
onerror | When a speech recognition error occurs. |
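To see the order in which these events fire, one option is to log all of them. This is a sketch only: the helper name `logRecognitionEvents` is an assumption, and it relies on the recognition object being an event target, so `addEventListener` can be used alongside the `on...` properties.

```javascript
// Event names from the table above, without the "on" prefix
const EVENT_NAMES = ['start', 'speechstart', 'speechend', 'result', 'nomatch', 'error', 'end'];

// Hypothetical helper: log every recognition event as it fires.
function logRecognitionEvents(recognition, log = console.log) {
  for (const name of EVENT_NAMES) {
    recognition.addEventListener(name, (event) => {
      // Error events carry a short code such as "no-speech" or "not-allowed"
      log(name === 'error' ? `error: ${event.error}` : name);
    });
  }
}

// Possible usage in a page:
// logRecognitionEvents(recognition);
// recognition.start();
```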
Configuration
Name | Type | Description |
---|---|---|
interimResults | boolean | Controls whether interim results are returned. When set to true, interim results should be returned |
continuous | boolean | When the continuous attribute is set to false, the user agent must return no more than one final result in response to starting recognition |
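lang | string | Sets the recognition language as a BCP 47 tag such as en-US. If it is not set, the language of the HTML document (its lang attribute) is used, falling back to the browser's default language |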
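The three options are plain properties on the recognition instance. The sketch below groups them in one place; the helper name `configure` and its options object are assumptions for illustration, and `lang` is only assigned when a value is passed so that the default behaviour is left alone.

```javascript
// Hypothetical helper: apply the configuration options from the table.
function configure(recognition, options = {}) {
  const { lang, continuous = false, interimResults = false } = options;
  if (lang !== undefined) {
    recognition.lang = lang; // e.g. 'en-US' or 'ko-KR'
  }
  recognition.continuous = continuous;
  recognition.interimResults = interimResults;
  return recognition;
}

// Possible usage in a page:
// const recognition = configure(new webkitSpeechRecognition(), {
//   lang: 'en-US',
//   interimResults: true,
// });
```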
Examples
HTML
html
<div class="font-sans ">
  <div class="flex">
    <div>
      Status:
    </div>
    <div id="status">
      WAITING
    </div>
  </div>
  <div class="border py-2">
    <div class="flex">
      <div>
        Final:
      </div>
      <div id="final">
      </div>
    </div>
    <div class="flex">
      <div>
        Interim:
      </div>
      <div id="interim">
      </div>
    </div>
  </div>
  <div>
    <button id="start-btn">
      Start
    </button>
    <button id="stop-btn">
      Stop
    </button>
  </div>
</div>
Script
js
// Check browser support
if ("webkitSpeechRecognition" in window) {
  // webkitSpeechRecognition instance
  let speechRecognition = new webkitSpeechRecognition();
  // The final transcription accumulates here
  let finalTranscript = "";
  speechRecognition.continuous = true; // Keep recording until the Stop button is clicked
  speechRecognition.interimResults = true; // Set to true so interim results are displayed as well
  // Callback function for the onstart event
  speechRecognition.onstart = () => {
    document.querySelector("#status").innerHTML = 'START';
  };
  speechRecognition.onerror = () => {
    document.querySelector("#status").innerHTML = 'ERROR';
  };
  speechRecognition.onend = () => {
    document.querySelector("#status").innerHTML = 'WAITING';
  };
  speechRecognition.onresult = (event) => {
    let interimTranscript = "";
    // Loop through the results from the speech recognition object.
    for (let i = event.resultIndex; i < event.results.length; ++i) {
      // results is two-dimensional: a list of results, each holding a list of alternatives
      if (event.results[i].isFinal) {
        finalTranscript += event.results[i][0].transcript;
      } else {
        interimTranscript += event.results[i][0].transcript;
      }
    }
    // Update HTML
    document.querySelector("#final").innerHTML = finalTranscript;
    document.querySelector("#interim").innerHTML = interimTranscript;
  };
  // Start button click event handler
  document.querySelector("#start-btn").onclick = () => {
    // Start the speech recognition
    speechRecognition.start();
  };
  // Stop button click event handler
  document.querySelector("#stop-btn").onclick = () => {
    // Stop the speech recognition
    speechRecognition.stop();
  };
} else {
  console.log("Speech Recognition Not Available");
}
Copy and paste the code above. Clicking the Start button starts recording, and clicking the Stop button stops it.
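The loop inside onresult can also be read on its own. The sketch below pulls it out into a function that takes the event and returns both transcripts from the results delivered in that event; the name `collectTranscripts` and the returned property names are assumptions for illustration.

```javascript
// Hypothetical helper: split the results of one result event into
// final and interim text.
function collectTranscripts(event) {
  let finalText = '';
  let interimText = '';
  // resultIndex is the first result that changed since the last event
  for (let i = event.resultIndex; i < event.results.length; ++i) {
    const result = event.results[i];        // one recognized phrase
    const transcript = result[0].transcript; // its first alternative
    if (result.isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText, interimText };
}

// Possible usage in a page:
// recognition.onresult = (event) => {
//   const { finalText, interimText } = collectTranscripts(event);
//   finalTranscript += finalText;
// };
```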
Advanced - Continuous on mobile
Unfortunately, the continuous option is hard to use on mobile, where browsers tend to end the session on their own, apparently for security reasons. However, there is a small trick to work around this.
js
if ("webkitSpeechRecognition" in window) {
  let speechRecognition = new webkitSpeechRecognition();
  let finalTranscript = "";
  // Flag that controls whether recognition is started again
  let isContinous = false;
  speechRecognition.continuous = false; // continuous = true does not work on mobile
  speechRecognition.interimResults = true;
  // Callback function for the onstart event
  speechRecognition.onstart = () => {
    document.querySelector("#status").innerHTML = 'START';
  };
  speechRecognition.onerror = () => {
    document.querySelector("#status").innerHTML = 'ERROR';
    // Do not restart after an error (for example, when permission is denied)
    isContinous = false;
  };
  speechRecognition.onend = () => {
    // Restart if the flag is true. The session has fully ended here;
    // calling start() earlier (e.g. in onspeechend) throws InvalidStateError.
    if (isContinous) {
      speechRecognition.start();
    } else {
      document.querySelector("#status").innerHTML = 'WAITING';
    }
  };
  speechRecognition.onresult = (event) => {
    let interimTranscript = "";
    // Loop through the results from the speech recognition object.
    for (let i = event.resultIndex; i < event.results.length; ++i) {
      if (event.results[i].isFinal) {
        finalTranscript += event.results[i][0].transcript;
      } else {
        interimTranscript += event.results[i][0].transcript;
      }
    }
    document.querySelector("#final").innerHTML = finalTranscript;
    document.querySelector("#interim").innerHTML = interimTranscript;
  };
  document.querySelector("#start-btn").onclick = () => {
    speechRecognition.start();
    isContinous = true;
  };
  document.querySelector("#stop-btn").onclick = () => {
    speechRecognition.stop();
    isContinous = false;
  };
} else {
  console.log("Speech Recognition Not Available");
}
The isContinous variable makes recognition start again each time the browser ends it automatically, until the user clicks the Stop button.
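One possible variation of the restart trick, sketched under a few assumptions, wraps the flag in a small controller. It restarts from the end event, because the specification makes start() throw an InvalidStateError while a session is still active, and it gives up after permission errors so that a denied microphone does not cause an endless loop. The names `keepListening`, `shouldRestart` and `FATAL_ERRORS` are hypothetical.

```javascript
// Error codes after which restarting is pointless (permission was refused)
const FATAL_ERRORS = ['not-allowed', 'service-not-allowed'];

// Hypothetical helper: keep a non-continuous recognition session alive
// by restarting it whenever it ends, until stop() is called.
function keepListening(recognition) {
  let shouldRestart = false;

  recognition.addEventListener('error', (event) => {
    if (FATAL_ERRORS.includes(event.error)) {
      shouldRestart = false;
    }
  });

  recognition.addEventListener('end', () => {
    // The session is fully closed here, so start() is allowed again
    if (shouldRestart) {
      recognition.start();
    }
  });

  return {
    start() {
      shouldRestart = true;
      recognition.start();
    },
    stop() {
      shouldRestart = false;
      recognition.stop();
    },
  };
}

// Possible usage in a page:
// const controller = keepListening(speechRecognition);
// document.querySelector('#start-btn').onclick = () => controller.start();
// document.querySelector('#stop-btn').onclick = () => controller.stop();
```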