
Voice recognition

Iara's voice recognition works in two main modes: without intermediate results (interimResults is false) and with intermediate results (interimResults is true).

In both cases, recognition works the same way and is based on calls to start() and stop(), as described in the next section.

General operation

Iara's voice recognition is driven by two main calls: start() and stop().

Assuming the recognizer was initialized correctly, call its start() method to begin voice recognition. When the application decides it is time to finish, e.g. the user released the record button, call the recognizer's stop() method.

Iara will not perform new recognitions on the audio input device until start() is called again (after stop() has completed the current recognition).

Tip: check the events section to learn more about callbacks and audio recognition events.

The example below illustrates the general flow:

var recognition = new IaraSpeechRecognition();

recognition.init({
  userId: 'meu@email.com',
  apiToken: '197765800edb8affcb44a7ae7b4ff0a3'
}).done(function(e) {
  // First call recognition.start();
  // after a period, call recognition.stop().
});

recognition.onresult = function(event) {
  var text = event.result.transcript;
  console.log('recognized text: ' + text);
};

The event parameter received by the onresult callback contains a property called result, which is an IaraSpeechRecognitionResult object. The transcript property of this result contains the text that Iara recognized from the input audio.

The result attribute of the event also contains other information about the recognition, but result.transcript is the property that holds the recognized text.

Tip: both recognition modes (with or without intermediate results) rely on the start() and stop() calls. The only difference is the number of times the onresult callback (the result event) is called.
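The start()/stop() cycle described in this section maps naturally onto a press-and-hold record button. The sketch below is one possible way to wire it up, not part of the SDK: bindPushToTalk is a hypothetical helper, and it only assumes that the recognizer exposes start() and stop() and that the button exposes addEventListener().

```javascript
// Hypothetical helper (not part of the Iara SDK): wires a press-and-hold
// button to the recognizer. `recognition` is assumed to be an initialized
// recognizer exposing start() and stop().
function bindPushToTalk(recognition, button) {
  var recording = false;

  function begin() {
    if (recording) { return; }    // ignore repeated press events
    recording = true;
    recognition.start();
  }

  function end() {
    if (!recording) { return; }   // ignore a release without a matching press
    recording = false;
    recognition.stop();
  }

  button.addEventListener('mousedown', begin);
  button.addEventListener('mouseup', end);
  button.addEventListener('mouseleave', end); // pointer left the button while pressed

  return {
    isRecording: function() { return recording; }
  };
}
```

In a page, this helper would typically be called inside the done() callback of init(), for example with document.getElementById('btnStart') as the button. The guard flag keeps start() and stop() strictly alternating, which matches the rule that a new recognition only begins after start() is called again.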

Recognition types

No intermediate results - interimResults: false

Whether the recognizer works with or without intermediate results is controlled by the interimResults attribute passed to init(), as in the example below:

var recognition = new IaraSpeechRecognition();

// The `interimResults` attribute controls whether intermediate
// recognition results are produced. If it is omitted, the default
// value is false (no intermediate results).
recognition.init({
  userId: 'meu@email.com',
  apiToken: '197765800edb8affcb44a7ae7b4ff0a3',
  interimResults: false
});

When interimResults: false is used in init(), the onresult callback is called automatically, exactly once, after stop() is called.

Below is an example of voice recognition without intermediate results:

var recognition = new IaraSpeechRecognition();

recognition.init({
  userId: 'meu@email.com',
  apiToken: '197765800edb8affcb44a7ae7b4ff0a3',
  interimResults: false
}).done(function(e) {
  // Set up the buttons that enable/disable audio recognition.
  var botaoStart = document.getElementById('btnStart');
  var botaoStop = document.getElementById('btnStop');

  botaoStart.addEventListener('click', function() {
    recognition.start();
  });

  botaoStop.addEventListener('click', function() {
    recognition.stop();
  });
});

// Handle the `onresult` callback, called after `stop()` once Iara
// finishes recognizing the input audio.
recognition.onresult = function(event) {
  var text = event.result.transcript;
  console.log('recognized text: ' + text);
};

In the example above, once the recognizer calls done(), indicating that everything is correctly configured, two HTML buttons are set up to turn Iara's voice recognizer on and off. The first button (whose id is btnStart) activates voice recognition when clicked: its click event handler calls recognition.start(), which starts the recognition. Similarly, the button with id btnStop, when clicked, calls recognition.stop(), which ends the recognition.

Finally, at the end of the example, a function is assigned to the onresult property, which is tied to the result event. When recognizing without intermediate results, the event (and the onresult callback) fires only once, after stop() is called. The delay between stop() and the onresult call is not fixed and depends on the processing power of the user's computer.
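Because the delay between stop() and onresult is not fixed, an application may want to show a "processing" hint while it waits. The sketch below is one way to track that phase; createRecognitionSession and its state names ('idle', 'recording', 'processing') are hypothetical and not part of the SDK. It assumes a recognizer initialized with interimResults: false, so that onresult fires once per start()/stop() cycle.

```javascript
// Hypothetical wrapper (not part of the Iara SDK). Tracks which phase the
// recognition is in so the UI can react between stop() and the single
// onresult call. Assumes interimResults: false.
function createRecognitionSession(recognition, onText, onStateChange) {
  var state = 'idle';

  function setState(next) {
    state = next;
    if (onStateChange) { onStateChange(next); }
  }

  recognition.onresult = function(event) {
    onText(event.result.transcript);
    setState('idle');             // result arrived; ready for a new start()
  };

  return {
    start: function() {
      if (state !== 'idle') { return; }
      setState('recording');
      recognition.start();
    },
    stop: function() {
      if (state !== 'recording') { return; }
      setState('processing');     // waiting for onresult; the delay is not fixed
      recognition.stop();
    },
    getState: function() { return state; }
  };
}
```

The button handlers would then call session.start() and session.stop() instead of calling the recognizer directly, and onStateChange could toggle a spinner or disable the buttons while the state is 'processing'.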

Below is a complete example of voice recognition without intermediate results:

<html>
<head>
  <!-- Iara Speech SDK: https://developers.iarahealth.com/javascript/docs/geral-instalacao -->
  <script src="https://cdn.iarahealth.com/sdk/javascript/1.9.0/iara-speech.min.js"></script>
</head>
<body>

<button id="btnStart">Start</button>
<button id="btnStop">Stop</button>

<script type="text/javascript">
  var myUserId = 'meu@email.com'; // use your userId
  var myApiToken = '197765800edb8affcb44a7ae7b4ff0a3'; // use your apiToken

  // Instantiate Iara's recognizer
  var recognition = new IaraSpeechRecognition();

  // Initialize the SDK, i.e. authentication, check ALS, download voice model, etc.
  recognition.init({
    userId: myUserId,
    apiToken: myApiToken,
    interimResults: false // no intermediate recognition results
  }).done(function(e) {
    // The main voice recognition event is "onresult", called when
    // some input audio is recognized.
    recognition.onresult = function(event) {
      // The "event" parameter contains a property named "result", which is an
      // IaraSpeechRecognitionResult object. The "transcript" property of this
      // result contains the text that Iara recognized from the input audio.
      var text = event.result.transcript.toLowerCase();
      console.log('Recognized text: ' + text);
    };

    // Set up the buttons that enable/disable audio recognition.
    var botaoStart = document.getElementById('btnStart');
    var botaoStop = document.getElementById('btnStop');

    botaoStart.addEventListener('click', function() {
      recognition.start();
    });

    botaoStop.addEventListener('click', function() {
      recognition.stop();
    });
  }).fail(function(e) {
    // A problem has occurred. The "e" parameter contains information about the error.
    console.error('Some problem in the Iara initialization: ' +
      e.errorMessage);
  }).progress(function(e) {
    // This callback is called several times by the SDK during startup.
    // You can use it to follow the startup steps, for example to tell
    // your user that the voice model is being downloaded.
    console.debug('Iara is initializing: ' + e.initType);
  });
</script>
</body>
</html>
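The done()/fail()/progress() chain used in the complete example can also be adapted to a native Promise, which some codebases find easier to compose. The sketch below is only an illustration: initAsPromise is a hypothetical helper, and it assumes that init() returns a chainable object whose done(), fail() and progress() methods each return that same object, as the chained calls in the example suggest. It relies only on the errorMessage and initType fields shown in that example.

```javascript
// Hypothetical helper (not part of the Iara SDK): adapts the chainable
// object returned by init() to a native Promise.
// Assumption: done(), fail() and progress() are chainable, as in the example.
function initAsPromise(recognition, options, onProgress) {
  return new Promise(function(resolve, reject) {
    recognition.init(options)
      .done(resolve)
      .fail(function(e) {
        // Surface the SDK error message as a regular Error.
        reject(new Error(e.errorMessage));
      })
      .progress(function(e) {
        // Forward each startup step to the optional listener.
        if (onProgress) { onProgress(e.initType); }
      });
  });
}
```

With this helper, the buttons and the onresult handler would be set up in the then() callback of the returned Promise, and initialization errors would be handled in catch().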

Intermediate results - interimResults: true

Recognition with intermediate results works much like recognition without them. When interimResults: true is passed to init(), the onresult callback is called automatically and repeatedly after start() is called.

The onresult event fires repeatedly at a variable interval, which depends on the computational resources available on the user's computer. After stop() is called, onresult may fire a few more times before recognition finally ends.

Tip: you can tell whether an onresult call is the last one (the final result after stop() is called) through the isFinal property of the event passed to the callback that handles the results.

Below is an example of voice recognition with intermediate results:

var recognition = new IaraSpeechRecognition();

recognition.init({
  userId: 'meu@email.com',
  apiToken: '197765800edb8affcb44a7ae7b4ff0a3',
  interimResults: true // intermediate results will be produced
}).done(function(e) {
  // Set up the buttons that enable/disable voice recognition.
  var botaoStart = document.getElementById('btnStart');
  var botaoStop = document.getElementById('btnStop');

  botaoStart.addEventListener('click', function() {
    recognition.start();
  });

  botaoStop.addEventListener('click', function() {
    recognition.stop();
  });
});

// Handle the `onresult` callback, called repeatedly after `start()`. After
// `stop()` is called, `onresult` may still be called a few more times.
recognition.onresult = function(event) {
  var text = event.result.transcript;

  // The `isFinal` property of the received event tells whether
  // this is the final recognition result (obtained after
  // `stop()` is called).
  if (event.isFinal) {
    console.log('final text: ' + text);
  } else {
    console.log('intermediate text: ' + text);
  }
};
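When intermediate results are enabled, a common need is to show provisional text while the user speaks and to commit it once the final result arrives. The sketch below shows one way to do that with the isFinal property. createTranscriptBuffer is a hypothetical helper, not part of the SDK, and it rests on an assumption the text above does not confirm: that each intermediate result replaces the previous one rather than extending it.

```javascript
// Hypothetical helper (not part of the Iara SDK): builds an onresult handler
// that separates committed (final) text from pending (intermediate) text.
// Assumption: each intermediate result replaces the previous one.
function createTranscriptBuffer(render) {
  var committed = [];   // final texts from completed start()/stop() cycles
  var pending = '';     // latest intermediate text of the current cycle

  return function handleResult(event) {
    var text = event.result.transcript;

    if (event.isFinal) {
      committed.push(text);   // final result: keep it permanently
      pending = '';
    } else {
      pending = text;         // intermediate result: overwrite the preview
    }

    render(committed.join(' '), pending);
  };
}
```

The returned function can be assigned directly to recognition.onresult. The render callback receives the committed text and the current preview, so a page could, for instance, draw the preview in a lighter color until the final result replaces it.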

Below is a complete example of voice recognition with intermediate results:

<html>
<head>
  <!-- Iara Speech SDK: https://developers.iarahealth.com/javascript/docs/geral-instalacao -->
  <script src="https://cdn.iarahealth.com/sdk/javascript/1.9.0/iara-speech.min.js"></script>
</head>
<body>

<button id="btnStart">Start</button>
<button id="btnStop">Stop</button>

<script type="text/javascript">
  var myUserId = 'meu@email.com'; // use your userId
  var myApiToken = '197765800edb8affcb44a7ae7b4ff0a3'; // use your apiToken

  // Instantiate Iara's recognizer
  var recognition = new IaraSpeechRecognition();

  // Initialize the SDK, i.e. authentication, check ALS, download voice model, etc.
  recognition.init({
    userId: myUserId,
    apiToken: myApiToken,
    interimResults: true // intermediate results will be produced
  }).done(function(e) {
    // The main voice recognition event is "onresult", called when
    // some input audio is recognized.
    recognition.onresult = function(event) {
      // The "event" parameter contains a property named "result", which is an
      // IaraSpeechRecognitionResult object. The "transcript" property of this
      // result contains the text that Iara recognized from the input audio.
      var text = event.result.transcript;

      // The `isFinal` property of the received event tells whether
      // this is the final recognition result (obtained after
      // `stop()` is called).
      if (event.isFinal) {
        console.log('Final text: ' + text);
      } else {
        console.log('Intermediate text: ' + text);
      }
    };

    // Set up the buttons that enable/disable voice recognition.
    var botaoStart = document.getElementById('btnStart');
    var botaoStop = document.getElementById('btnStop');

    botaoStart.addEventListener('click', function() {
      recognition.start();
    });

    botaoStop.addEventListener('click', function() {
      recognition.stop();
    });
  }).fail(function(e) {
    // A problem has occurred. The "e" parameter contains information about the error.
    console.error('Some problem in the Iara initialization: ' +
      e.errorMessage);
  }).progress(function(e) {
    // This callback is called several times by the SDK during startup.
    // You can use it to follow the startup steps, for example to tell
    // your user that the voice model is being downloaded.
    console.debug('Iara is initializing: ' + e.initType);
  });
</script>
</body>
</html>