SpeechRecognition

Experimental: This is an experimental technology. Check the Browser compatibility table carefully before using this in production.

The SpeechRecognition interface of the Web Speech API is the controller interface for the recognition service; it also handles the SpeechRecognitionEvent sent from the recognition service.

Note: On some browsers, like Chrome, using Speech Recognition on a web page involves a server-based recognition engine. Your audio is sent to a web service for recognition processing, so it won't work offline.

Constructor

SpeechRecognition.SpeechRecognition()

Creates a new SpeechRecognition object.
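Some browsers, Chrome among them, expose the constructor only under a webkit prefix (webkitSpeechRecognition). The following is a minimal feature-detection sketch; getRecognitionConstructor is a hypothetical helper name, not part of the API.

```javascript
// Return the SpeechRecognition constructor found on the given global
// object, or null when neither the standard nor the prefixed name exists.
// `getRecognitionConstructor` is a hypothetical helper, not part of the API.
function getRecognitionConstructor(scope = globalThis) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

const RecognitionCtor = getRecognitionConstructor();
if (RecognitionCtor) {
  console.log('Speech recognition is available.');
} else {
  console.log('Speech recognition is not supported in this environment.');
}
```

When the helper returns a constructor, calling new RecognitionCtor() yields the same kind of object as new SpeechRecognition() would.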

Properties

SpeechRecognition also inherits properties from its parent interface, EventTarget.

SpeechRecognition.grammars

Returns and sets a collection of SpeechGrammar objects that represent the grammars that will be understood by the current SpeechRecognition.

SpeechRecognition.lang

Returns and sets the language of the current SpeechRecognition. If not specified, this defaults to the HTML lang attribute value, or the user agent's language setting if that isn't set either.

SpeechRecognition.continuous

Controls whether continuous results are returned for each recognition, or only a single result. Defaults to single (false).

SpeechRecognition.interimResults

Controls whether interim results should be returned (true) or not (false). Interim results are results that are not yet final (i.e. the SpeechRecognitionResult.isFinal property is false).

SpeechRecognition.maxAlternatives

Sets the maximum number of SpeechRecognitionAlternatives provided per result. The default value is 1.

SpeechRecognition.serviceURI

Specifies the location of the speech recognition service used by the current SpeechRecognition to handle the actual recognition. The default is the user agent's default speech service.
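As a rough illustration of how the properties above fit together, the sketch below applies a set of options to a recognition object. The helper name configureRecognition and the chosen defaults are assumptions made for this example; the defaults simply mirror the values used elsewhere on this page.

```javascript
// Apply recognition settings to `recognition` and return it.
// `configureRecognition` and its default values are illustrative only.
function configureRecognition(recognition, options = {}) {
  const settings = {
    lang: 'en-US',          // language to recognize
    continuous: false,      // single result per recognition
    interimResults: false,  // only deliver final results
    maxAlternatives: 1,     // one alternative per result
    ...options,
  };
  recognition.lang = settings.lang;
  recognition.continuous = settings.continuous;
  recognition.interimResults = settings.interimResults;
  recognition.maxAlternatives = settings.maxAlternatives;
  return recognition;
}
```

In a browser this might be called as configureRecognition(new SpeechRecognition(), { interimResults: true }) before start() is invoked.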

Methods

SpeechRecognition also inherits methods from its parent interface, EventTarget.

SpeechRecognition.abort()

Stops the speech recognition service from listening to incoming audio, and doesn't attempt to return a SpeechRecognitionResult.

SpeechRecognition.start()

Starts the speech recognition service listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.

SpeechRecognition.stop()

Stops the speech recognition service from listening to incoming audio, and attempts to return a SpeechRecognitionResult using the audio captured so far.
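The three methods can be combined into a small start/stop toggle. The sketch below assumes that calling start() on an already started recognition object throws an error, so it tracks whether a session is active and resets that flag on the end event. createRecognitionToggle is a hypothetical helper name.

```javascript
// Wrap a recognition object in a toggle that never calls start() twice
// in a row. `createRecognitionToggle` is a hypothetical helper.
function createRecognitionToggle(recognition) {
  let active = false;

  // The end event fires after stop(), abort(), or an error, so it is the
  // single place where the session is marked as finished.
  recognition.addEventListener('end', () => {
    active = false;
  });

  return {
    isActive: () => active,
    // Start listening, or stop and ask for a result from the audio so far.
    toggle() {
      if (active) {
        recognition.stop();
      } else {
        recognition.start();
        active = true; // only reached if start() did not throw
      }
    },
    // Stop listening without attempting to return a result.
    cancel() {
      recognition.abort();
    },
  };
}
```

The choice between stop() and abort() is the choice between keeping and discarding whatever was captured before the call.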

Events

Listen to these events using addEventListener() or by assigning an event listener to the oneventname property of this interface.

audiostart

Fired when the user agent has started to capture audio. Also available via the onaudiostart property.

audioend

Fired when the user agent has finished capturing audio. Also available via the onaudioend property.

end

Fired when the speech recognition service has disconnected. Also available via the onend property.

error

Fired when a speech recognition error occurs. Also available via the onerror property.

nomatch

Fired when the speech recognition service returns a final result with no significant recognition. This may involve some degree of recognition that doesn't meet or exceed the confidence threshold. Also available via the onnomatch property.

result

Fired when the speech recognition service returns a result — a word or phrase has been positively recognized and this has been communicated back to the app. Also available via the onresult property.

soundstart

Fired when any sound (recognizable speech or not) has been detected. Also available via the onsoundstart property.

soundend

Fired when any sound (recognizable speech or not) has stopped being detected. Also available via the onsoundend property.

speechstart

Fired when sound that is recognized by the speech recognition service as speech has been detected. Also available via the onspeechstart property.

speechend

Fired when speech recognized by the speech recognition service has stopped being detected. Also available via the onspeechend property.

start

Fired when the speech recognition service has begun listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition. Also available via the onstart property.
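To see when each of the events above fires in a given browser, one option is to attach a logging listener for every event type with addEventListener(). This is a minimal sketch; logLifecycle and LIFECYCLE_EVENTS are hypothetical names, and the listed order is not a claim about the order in which a browser dispatches the events.

```javascript
// Event types that carry no extra payload of interest for logging.
const LIFECYCLE_EVENTS = [
  'start', 'audiostart', 'soundstart', 'speechstart',
  'speechend', 'soundend', 'audioend', 'end',
];

// Attach a logging listener for each recognition event.
// `logLifecycle` is a hypothetical helper; `log` defaults to console.log.
function logLifecycle(target, log = console.log) {
  for (const type of LIFECYCLE_EVENTS) {
    target.addEventListener(type, () => log(`recognition event: ${type}`));
  }
  // Error events expose an `error` code describing what went wrong.
  target.addEventListener('error', (event) => {
    log(`recognition error: ${event.error}`);
  });
  target.addEventListener('nomatch', () => {
    log('recognition event: nomatch');
  });
}

// Browser usage (assumes `recognition` is a SpeechRecognition instance):
// logLifecycle(recognition);
```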

Examples

In our simple Speech color changer example, we create a new SpeechRecognition object instance using the SpeechRecognition() constructor, create a new SpeechGrammarList, and set it to be the grammar that will be recognized by the SpeechRecognition instance using the SpeechRecognition.grammars property.

After some other values have been defined, we then set it so that the recognition service starts when a click event occurs (see SpeechRecognition.start()). When a result has been successfully recognized, the SpeechRecognition.onresult handler fires; we extract the color that was spoken from the event object and set the background color of the <html> element to that color.

// Some browsers (for example Chrome) expose these interfaces only with a webkit prefix.
var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var SpeechGrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;

// JSGF grammar listing the color names we want to recognize.
var grammar = '#JSGF V1.0; grammar colors; public <color> = aqua | azure | beige | bisque | black | blue | brown | chocolate | coral | crimson | cyan | fuchsia | ghostwhite | gold | goldenrod | gray | green | indigo | ivory | khaki | lavender | lime | linen | magenta | maroon | moccasin | navy | olive | orange | orchid | peru | pink | plum | purple | red | salmon | sienna | silver | snow | tan | teal | thistle | tomato | turquoise | violet | white | yellow ;';
var recognition = new SpeechRecognition();
var speechRecognitionList = new SpeechGrammarList();
speechRecognitionList.addFromString(grammar, 1);
recognition.grammars = speechRecognitionList;
recognition.continuous = false;
recognition.lang = 'en-US';
recognition.interimResults = false;
recognition.maxAlternatives = 1;

var diagnostic = document.querySelector('.output');
var bg = document.querySelector('html');

// Start listening when the user clicks anywhere on the page.
document.body.onclick = function() {
  recognition.start();
  console.log('Ready to receive a color command.');
};

// Use the first alternative of the first result as the spoken color.
recognition.onresult = function(event) {
  var color = event.results[0][0].transcript;
  diagnostic.textContent = 'Result received: ' + color;
  bg.style.backgroundColor = color;
};
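The example above reads only event.results[0][0]. When continuous or interimResults is enabled, the results list can hold several entries, some of them not yet final. The sketch below shows one way to separate final from interim text; splitTranscripts is a hypothetical helper, and it assumes the first alternative of each result is the one to use.

```javascript
// Walk a SpeechRecognitionResultList-like object and collect the first
// alternative of each result, split by whether the result is final.
// `splitTranscripts` is a hypothetical helper, not part of the API.
function splitTranscripts(results) {
  let finalText = '';
  let interimText = '';
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    const transcript = result[0].transcript;
    if (result.isFinal) {
      finalText += transcript;
    } else {
      interimText += transcript;
    }
  }
  return { finalText, interimText };
}

// Browser usage (assumes `recognition` is a SpeechRecognition instance):
// recognition.onresult = function (event) {
//   const { finalText, interimText } = splitTranscripts(event.results);
//   console.log(finalText, interimText);
// };
```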

Browser compatibility

Desktop: Chrome, Edge, Firefox, Internet Explorer, Opera, Safari. Mobile: WebView Android, Chrome Android, Firefox for Android, Opera Android, Safari on iOS, Samsung Internet.

| Feature | Chrome | Edge | Firefox | Internet Explorer | Opera | Safari | WebView Android | Chrome Android | Firefox for Android | Opera Android | Safari on iOS | Samsung Internet |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SpeechRecognition | 33 [1] | ≤79 [1] | No | No | No | 14.1 | 4.4.3 [1] | 33 [1] | No | No | 14.5 | 2.0 [1] |
| SpeechRecognition() constructor | 33 | ≤79 | No | No | No | 14.1 | 37 | Yes | No | No | 14.5 | Yes |
| abort | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| audioend_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| audiostart_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| continuous | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| end_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| error_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| grammars | 33 | ≤79 | No | No | No | No | Yes | Yes | No | No | No | Yes |
| interimResults | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| lang | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| maxAlternatives | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| nomatch_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onaudioend | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onaudiostart | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onend | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onerror | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onnomatch | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onresult | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onsoundend | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onsoundstart | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onspeechend | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onspeechstart | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| onstart | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| result_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| serviceURI | 33 | ≤79 | No | No | No | No | Yes | Yes | No | No | No | Yes |
| soundend_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| soundstart_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| speechend_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| speechstart_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| start | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| start_event | 33 | 79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |
| stop | 33 | ≤79 | No | No | No | 14.1 | Yes | Yes | No | No | 14.5 | Yes |

[1] You'll need to serve your code through a web server for recognition to work.

© 2005–2021 MDN contributors.
Licensed under the Creative Commons Attribution-ShareAlike License v2.5 or later.
https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition