Audio Formats and Prompts

Plum Voice recommends the following links for background on audio file formats and compression.

For details about MP3, see: https://en.wikipedia.org/wiki/MP3
For details about WAV, see: https://en.wikipedia.org/wiki/WAV
For a glossary of audio terminology, see: http://www.stirlingaudioservices.com/gloss.htm
For details about mu-law (µ-law), see: https://en.wikipedia.org/wiki/%CE%9C-law_algorithm

At Plum, we record audio files with an external tool called Audacity. The program is free and easy to use, yet it offers a surprisingly deep set of tools and plugins for audio manipulation.

To download Audacity, visit this link and select the version for your operating system: http://www.audacityteam.org/download/

Supported Audio Formats

When choosing your format, keep in mind that our engine automatically runs every file through a µ-law compression algorithm. This means that all files will be downsampled to 8-bit, 8 kHz audio. If you encode your files at this lower quality yourself, you will likely find that they sound better than files the engine has to compress from a higher-quality source.
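To give a feel for what µ-law compression does to a sample, here is a minimal Python sketch of the continuous µ-law companding formula with µ = 255. It is an illustration only: the function names are hypothetical, it uses the textbook formula rather than the segmented bit layout used in telephony codecs, and it is not a description of how the Plum engine is implemented.

```python
import math

MU = 255  # the value commonly used for 8-bit mu-law

def mulaw_compress(sample: int) -> int:
    """Map a signed 16-bit PCM sample to a signed 8-bit code (-127..127).

    Uses the continuous formula sgn(x) * ln(1 + MU*|x|) / ln(1 + MU).
    """
    x = max(-1.0, min(1.0, sample / 32768.0))  # normalise to [-1, 1]
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return round(y * 127)

def mulaw_expand(code: int) -> int:
    """Approximate inverse: signed 8-bit code back to a 16-bit sample."""
    y = code / 127.0
    x = math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
    return round(x * 32767)

if __name__ == "__main__":
    # Round-tripping loses precision; the loss is larger for loud samples.
    for s in (100, 1000, 10000, 30000):
        code = mulaw_compress(s)
        print(s, code, mulaw_expand(code))
```

The logarithmic curve spends more of the 8-bit range on quiet samples than on loud ones, which is why speech survives the reduction reasonably well.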

There are three audio formats for development that are always recommended (all audio files must be mono, not stereo):

| Sample Rate      | Encoding          | File Extension |
|------------------|-------------------|----------------|
| 8000 (8 kHz)     | 16-bit linear PCM | .wav           |
| 8000 (8 kHz)     | 8-bit µ-law WAV   | .ul            |
| 32000 (32 kbps)* | 16 kHz            | .mp3           |

* Anything more would be overkill.
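Before uploading a recording, it can help to confirm that it matches the first recommended format (mono, 8 kHz, 16-bit linear PCM WAV). The following is a minimal sketch using Python's standard `wave` module; `check_wav` and `make_silent_wav` are hypothetical helper names, not part of any Plum tooling. Note that the `wave` module reads linear PCM files only, so a µ-law or a-law WAV raises `wave.Error` instead of returning a result.

```python
import io
import wave

def check_wav(source) -> list:
    """Return a list of problems; an empty list means the file matches.

    `source` may be a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getframerate() != 8000:
            problems.append(f"expected 8000 Hz, found {wav.getframerate()} Hz")
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, found {wav.getsampwidth() * 8}-bit")
    return problems

def make_silent_wav(channels: int, rate: int, seconds: int = 1) -> io.BytesIO:
    """Build an in-memory 16-bit PCM WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * rate * seconds)
    buf.seek(0)
    return buf

if __name__ == "__main__":
    print(check_wav(make_silent_wav(channels=1, rate=8000)))
    print(check_wav(make_silent_wav(channels=2, rate=44100)))
```

In practice you would pass the path of an exported recording, for example `check_wav("greeting.wav")`, where the file name is a placeholder.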

The following audio formats are also supported for the <audio> tag:

| Sample Rate      | Encoding                       | File Extension |
|------------------|--------------------------------|----------------|
| 8000 (8 kHz)     | 16-bit linear PCM headerless   | .l16           |
| 8000 (8 kHz)     | 8-bit µ-law encoded headerless | .ul            |
| 8000 (8 kHz)     | 8-bit a-law encoded headerless | .al            |
| 8000 (8 kHz)     | 8-bit a-law WAV                | .wav           |
| 32000 (32 kbps)* | 16 kHz                         | .mp3           |

* If the audio file is poorly encoded, the MP3 will not work.

Prompt Queuing and Barge-in Behavior

Given a series of prompts to be queued and eventually played, all initial prompts that disallow barge-in are played without barge-in. Once a prompt that allows barge-in is queued, all subsequent prompts are queued with barge-in enabled, regardless of their own setting, up to the point at which recognition triggers playback of the queue. Prompt queuing also extends back beyond a given field's prompts to include all prompts in blocks declared immediately before the field.

Given a code snippet as follows:

<block>
  <prompt bargein="false">
    This is prompt one.
  </prompt>
  <prompt bargein="true">
    This is prompt two.
  </prompt>
</block>
<field type="digits">
  <prompt bargein="false">
    This is prompt three.
  </prompt>
  <prompt bargein="true">
    This is prompt four.
  </prompt>
</field>

“This is prompt one” would be queued and played immediately, and the user would not have the opportunity to barge in on the prompt. However, “This is prompt two”, “This is prompt three”, and “This is prompt four” would all be queued up and played together, and the user would be able to barge in during any of the three prompts, including prompt three, even though it declares bargein="false".
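The queuing rule can be modeled in a few lines of Python. This is only a sketch of the behavior as described here, with a hypothetical `split_queue` helper; it is not the platform's actual implementation. Each prompt is a pair of its text and its declared bargein value.

```python
def split_queue(prompts):
    """Split prompts into (played without barge-in, played with barge-in).

    Leading prompts that disallow barge-in keep that setting. From the
    first prompt that allows barge-in onward, every prompt is treated as
    interruptible, whatever its own declared value.
    """
    for index, (_, bargein) in enumerate(prompts):
        if bargein:
            locked = [text for text, _ in prompts[:index]]
            interruptible = [text for text, _ in prompts[index:]]
            return locked, interruptible
    # No prompt allows barge-in, so nothing is interruptible.
    return [text for text, _ in prompts], []

if __name__ == "__main__":
    # The block and field prompts from the snippet, in document order.
    prompts = [
        ("This is prompt one.", False),
        ("This is prompt two.", True),
        ("This is prompt three.", False),
        ("This is prompt four.", True),
    ]
    locked, interruptible = split_queue(prompts)
    print("No barge-in:", locked)
    print("Barge-in allowed:", interruptible)
```

Under this model, prompt three ends up in the interruptible group because it follows prompt two, which matches the behavior described for the snippet.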


Last updated 5 years ago