SpeechRecognition

This is an experimental technology
Because this technology's specification has not stabilized, check the compatibility table for usage in various browsers. Also note that the syntax and behavior of an experimental technology are subject to change in future versions of browsers as the specification changes.

The SpeechRecognition interface of the Web Speech API is the controller interface for the recognition service; it also handles the SpeechRecognitionEvent sent from the recognition service.

Constructor

SpeechRecognition.SpeechRecognition()
Creates a new SpeechRecognition object.
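
Because some browsers expose the interface only under a vendor prefix (see the notes under Browser compatibility), it can help to look the constructor up defensively before calling it. The following is a minimal sketch, not part of the API: the helper names getSpeechRecognitionConstructor and createRecognition are hypothetical, and the global object is passed in as a parameter so the lookup can be exercised outside a browser.

```javascript
// Hypothetical helper: look up the constructor on a given global object,
// falling back to the webkit-prefixed name. Returns null if neither exists.
function getSpeechRecognitionConstructor(scope) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// Hypothetical helper: returns a new instance, or null when unsupported.
function createRecognition(scope) {
  var Ctor = getSpeechRecognitionConstructor(scope);
  return Ctor ? new Ctor() : null;
}

// Browser usage (skipped in environments without a window object).
if (typeof window !== 'undefined') {
  var recognition = createRecognition(window);
  if (!recognition) {
    console.log('Speech recognition is not supported in this browser.');
  }
}
```

Returning null instead of throwing lets the calling code decide how to degrade, for example by hiding a microphone button.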

Properties

SpeechRecognition also inherits properties from its parent interface, EventTarget.

SpeechRecognition.grammars
Returns and sets a collection of SpeechGrammar objects that represent the grammars that will be understood by the current SpeechRecognition.
SpeechRecognition.lang
Returns and sets the language of the current SpeechRecognition. If not specified, this defaults to the HTML lang attribute value, or the user agent's language setting if that isn't set either.
SpeechRecognition.continuous
Controls whether continuous results are returned for each recognition, or only a single result. Defaults to single (false).
SpeechRecognition.interimResults
Controls whether interim results should be returned (true) or not (false). Interim results are results that are not yet final (i.e. the SpeechRecognitionResult.isFinal property is false).
SpeechRecognition.maxAlternatives
Sets the maximum number of SpeechRecognitionAlternatives provided per result. The default value is 1.
SpeechRecognition.serviceURI
Specifies the location of the speech recognition service used by the current SpeechRecognition to handle the actual recognition. The default is the user agent's default speech service.
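
The properties above are plain assignments on the instance, so several of them can be applied in one step. As a rough sketch, the hypothetical helper configureRecognition below copies a fixed set of settings onto a recognition object and leaves any property that is not mentioned at its default. The helper name and the options object are assumptions made for illustration only.

```javascript
// Hypothetical helper: copy a fixed set of settings onto a recognition object.
// Properties missing from `options` are left untouched (so defaults apply).
function configureRecognition(recognition, options) {
  var names = ['lang', 'continuous', 'interimResults', 'maxAlternatives'];
  names.forEach(function (name) {
    if (options[name] !== undefined) {
      recognition[name] = options[name];
    }
  });
  return recognition;
}

// Browser usage, assuming `recognition` is an existing SpeechRecognition instance:
// configureRecognition(recognition, {
//   lang: 'en-US',
//   interimResults: true,
//   maxAlternatives: 3
// });
```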

Event handlers

SpeechRecognition.onaudiostart
Fired when the user agent has started to capture audio.
SpeechRecognition.onaudioend
Fired when the user agent has finished capturing audio.
SpeechRecognition.onend
Fired when the speech recognition service has disconnected.
SpeechRecognition.onerror
Fired when a speech recognition error occurs.
SpeechRecognition.onnomatch
Fired when the speech recognition service returns a final result with no significant recognition. This may involve some degree of recognition that doesn't meet or exceed the confidence threshold.
SpeechRecognition.onresult
Fired when the speech recognition service returns a result — a word or phrase has been positively recognized and this has been communicated back to the app.
SpeechRecognition.onsoundstart
Fired when any sound (recognizable speech or not) has been detected.
SpeechRecognition.onsoundend
Fired when any sound (recognizable speech or not) has stopped being detected.
SpeechRecognition.onspeechstart
Fired when sound that is recognized by the speech recognition service as speech has been detected.
SpeechRecognition.onspeechend
Fired when speech recognized by the speech recognition service has stopped being detected.
SpeechRecognition.onstart
Fired when the speech recognition service has begun listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.
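
To see when each of the handlers listed above fires in a particular browser, one option is to attach a logging function to all of them during development. This is only a sketch: attachLifecycleLogging is a hypothetical helper, and it overwrites any handler already assigned (including onresult), so it is meant for exploration rather than production code.

```javascript
// Hypothetical debugging helper: assign a logging handler to every
// on<event> property listed in this section. Overwrites existing handlers.
function attachLifecycleLogging(recognition, log) {
  var names = [
    'start', 'audiostart', 'soundstart', 'speechstart',
    'speechend', 'soundend', 'audioend',
    'result', 'nomatch', 'error', 'end'
  ];
  names.forEach(function (name) {
    recognition['on' + name] = function () {
      log(name);
    };
  });
  return names;
}

// Browser usage, assuming `recognition` is an existing SpeechRecognition instance:
// attachLifecycleLogging(recognition, function (name) {
//   console.log('SpeechRecognition event: ' + name);
// });
```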

Methods

SpeechRecognition also inherits methods from its parent interface, EventTarget.

SpeechRecognition.abort()
Stops the speech recognition service from listening to incoming audio, and doesn't attempt to return a SpeechRecognitionResult.
SpeechRecognition.start()
Starts the speech recognition service listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.
SpeechRecognition.stop()
Stops the speech recognition service from listening to incoming audio, and attempts to return a SpeechRecognitionResult using the audio captured so far.
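
As an illustration of how start() and stop() might be combined, the sketch below builds a push-to-toggle function: the first call starts listening, and the next call asks the service to stop and return a result from the audio captured so far. The helper createToggle is hypothetical. It tracks state with a local flag that is reset on the end event, which is registered through addEventListener (inherited from EventTarget) so that no on<event> handler is overwritten.

```javascript
// Hypothetical helper: returns a function that alternates between
// start() and stop() on the given recognition object.
function createToggle(recognition) {
  var listening = false;

  // Reset the flag once the service has disconnected, however that happened.
  recognition.addEventListener('end', function () {
    listening = false;
  });

  return function toggle() {
    if (listening) {
      // Stop listening and try to return a result from the audio so far.
      recognition.stop();
    } else {
      recognition.start();
      listening = true;
    }
  };
}

// Browser usage, assuming `recognition` is an existing SpeechRecognition instance:
// document.body.onclick = createToggle(recognition);
```

Swapping stop() for abort() in the sketch would discard the captured audio instead of attempting to return a result.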

Examples

In our simple Speech color changer example, we create a new SpeechRecognition object instance using the SpeechRecognition() constructor, create a new SpeechGrammarList, and set it to be the grammar that will be recognized by the SpeechRecognition instance using the SpeechRecognition.grammars property.

After some other values have been defined, we set the recognition service to start when a click event occurs (see SpeechRecognition.start()). When a result has been successfully recognized, the SpeechRecognition.onresult handler fires; we extract the color that was spoken from the event object and set the background color of the <html> element to that color.

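// Chrome currently exposes these interfaces with a webkit prefix (see the
// compatibility notes), so fall back to the prefixed names when needed.
var SpeechRecognitionImpl = window.SpeechRecognition || window.webkitSpeechRecognition;
var SpeechGrammarListImpl = window.SpeechGrammarList || window.webkitSpeechGrammarList;

// JSGF grammar listing the color names we want the service to recognize.
var grammar = '#JSGF V1.0; grammar colors; public <color> = aqua | azure | beige | bisque | black | blue | brown | chocolate | coral | crimson | cyan | fuchsia | ghostwhite | gold | goldenrod | gray | green | indigo | ivory | khaki | lavender | lime | linen | magenta | maroon | moccasin | navy | olive | orange | orchid | peru | pink | plum | purple | red | salmon | sienna | silver | snow | tan | teal | thistle | tomato | turquoise | violet | white | yellow ;';

var recognition = new SpeechRecognitionImpl();
var speechRecognitionList = new SpeechGrammarListImpl();
// The second argument is the weight of this grammar relative to others in the list.
speechRecognitionList.addFromString(grammar, 1);
recognition.grammars = speechRecognitionList;
//recognition.continuous = false;
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;

var diagnostic = document.querySelector('.output');
var bg = document.querySelector('html');

// Start listening when the user clicks anywhere on the page.
document.body.onclick = function() {
  recognition.start();
  console.log('Ready to receive a color command.');
};

// Use the first alternative of the first result as the spoken color.
recognition.onresult = function(event) {
  var color = event.results[0][0].transcript;
  diagnostic.textContent = 'Result received: ' + color;
  bg.style.backgroundColor = color;
};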
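
The example reads event.results[0][0], the first alternative of the first result, which is all that is available when maxAlternatives is 1. If maxAlternatives is raised, each result can carry several alternatives, each with a transcript and a confidence value. The sketch below picks the alternative with the highest confidence without assuming any particular ordering. The helper name bestTranscript is hypothetical and not part of the example's source.

```javascript
// Hypothetical helper: return the transcript of the alternative with the
// highest confidence in one SpeechRecognitionResult-like object
// (anything with a length and indexed { transcript, confidence } entries).
function bestTranscript(result) {
  var best = null;
  for (var i = 0; i < result.length; i++) {
    if (best === null || result[i].confidence > best.confidence) {
      best = result[i];
    }
  }
  return best ? best.transcript : '';
}

// Browser usage, assuming `recognition` and `bg` are set up as in the example:
// recognition.maxAlternatives = 3;
// recognition.onresult = function (event) {
//   bg.style.backgroundColor = bestTranscript(event.results[0]);
// };
```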

Specifications

Specification | Status | Comment
Web Speech API: the definition of 'SpeechRecognition' in that specification. | Draft |

Browser compatibility

Desktop:
Feature | Chrome | Firefox (Gecko) | Internet Explorer | Opera | Safari (WebKit)
Basic support | 33 (webkit prefix) [1] | No support [2] | No support | No support | No support
continuous | 33 [1] | No support | No support | No support | No support

Mobile:
Feature | Android | Chrome | Firefox Mobile (Gecko) | Firefox OS | IE Phone | Opera Mobile | Safari Mobile
Basic support | ? | (Yes) [1] | 44.0 (44) | 2.5 | No support | No support | No support
continuous | ? | (Yes) [1] | ? | No support | No support | No support | No support

  • [1] Speech recognition interfaces are currently prefixed in Chrome, so you'll need to prefix interface names appropriately, e.g. webkitSpeechRecognition. You'll also need to serve your code through a web server for recognition to work.
  • [2] Can be enabled via the media.webspeech.recognition.enable flag in about:config on mobile. Not implemented at all on Desktop Firefox — see bug 1248897.

Firefox OS permissions

To use speech recognition in an app, you need to specify the following permissions in your manifest:

"permissions": {
  "audio-capture" : {
    "description" : "Audio capture"
  },
  "speech-recognition" : {
    "description" : "Speech recognition"
  }
}

You also need a privileged app, so you need to include this as well:

  "type": "privileged"

Document Tags and Contributors

 Contributors to this page: Nickolay, chrisdavidmills
 Last updated by: Nickolay,