Audio Formats and Prompts


Last updated 5 years ago

Plum Voice provides the following links as background on audio files and compression:

  • MP3 details: https://en.wikipedia.org/wiki/MP3

  • WAV details: https://en.wikipedia.org/wiki/WAV

  • Glossary of audio terminology: http://www.stirlingaudioservices.com/gloss.htm

  • µ-law (mu-law) details: https://en.wikipedia.org/wiki/%CE%9C-law_algorithm

At Plum, we record audio files with Audacity, an external tool. The program is free and easy to use, but its tools and plugins for audio manipulation run deceptively deep.

  • To download Audacity, visit this link and select the build for your operating system: http://www.audacityteam.org/download/

Supported Audio Formats

When choosing a format, keep in mind that our engine automatically runs every file through the µ-law compression algorithm, so all audio is downsampled to 8-bit, 8 kHz. If you encode your files at that lower quality yourself, you will likely find that they sound better than files the engine has to compress from a higher-quality original.
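To make the 16-bit to 8-bit step concrete, here is a minimal sketch of the standard G.711 µ-law encoding of a single sample. It illustrates the general algorithm only and is not taken from Plum's engine; the function name `linear_to_ulaw` and the constant names are chosen for this example.

```python
# Sketch of G.711 mu-law companding for one 16-bit sample.
# Assumption: this mirrors the widely used reference algorithm,
# not Plum's actual implementation.

BIAS = 0x84   # 132, added before the segment search
CLIP = 32635  # largest magnitude that still fits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Encode a signed 16-bit PCM sample as one 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS

    # Find the segment (exponent): position of the highest set bit,
    # searched from bit 14 down to bit 7.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1

    # Keep the four bits just below the leading bit.
    mantissa = (magnitude >> (exponent + 3)) & 0x0F

    # mu-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


print(hex(linear_to_ulaw(0)))       # 0xff
print(hex(linear_to_ulaw(32767)))   # 0x80
print(hex(linear_to_ulaw(-32768)))  # 0x0
```

Each 16-bit sample collapses to a single byte (sign, 3-bit segment, 4-bit mantissa), which is why detail beyond what 8-bit, 8 kHz audio can carry does not survive the conversion.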

Three audio formats are always recommended for development (all audio files must be mono, not stereo):

| Sample Rate | Encoding | File Extension |
| --- | --- | --- |
| 8000 Hz (8 kHz) | 16-bit linear PCM | .wav |
| 8000 Hz (8 kHz) | 8-bit µ-law WAV | .ul |
| 16000 Hz (16 kHz) | MP3 at 32000 bps (32 kbps)* | .mp3 |

* A higher bit rate would be overkill.
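Before uploading a prompt, it can help to confirm that a WAV file matches the first recommended format (8 kHz, 16-bit linear PCM, mono). The sketch below uses Python's standard `wave` module; the helper names `check_wav` and `make_wav` are hypothetical, and the demo builds WAV data in memory rather than assuming any real file.

```python
import io
import wave


def check_wav(source) -> list[str]:
    """Return a list of problems; an empty list means the file is
    8 kHz, 16-bit, mono PCM. `source` is a path or a binary file object.

    Note: the `wave` module reads linear PCM only, so a mu-law or a-law
    WAV raises wave.Error instead of being checked here.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"expected mono, found {wav.getnchannels()} channels")
        if wav.getframerate() != 8000:
            problems.append(f"expected 8000 Hz, found {wav.getframerate()} Hz")
        if wav.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, found {wav.getsampwidth() * 8}-bit")
    return problems


def make_wav(channels: int, rate: int, width: int) -> io.BytesIO:
    """Build a short silent WAV in memory (for demonstration only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * channels * width * 80)
    buf.seek(0)
    return buf


print(check_wav(make_wav(1, 8000, 2)))   # []
print(check_wav(make_wav(2, 44100, 2)))  # two problems: channels and sample rate
```

In practice you would pass a file path, for example `check_wav("prompt.wav")`, where `prompt.wav` stands in for your own recording.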

The following audio formats are also supported for the <audio> tag:

| Sample Rate | Encoding | File Extension |
| --- | --- | --- |
| 8000 Hz (8 kHz) | 16-bit linear PCM, headerless | .l16 |
| 8000 Hz (8 kHz) | 8-bit µ-law encoded, headerless | .ul |
| 8000 Hz (8 kHz) | 8-bit a-law encoded, headerless | .al |
| 8000 Hz (8 kHz) | 8-bit a-law WAV | .wav |
| 16000 Hz (16 kHz) | MP3 at 32000 bps (32 kbps)* | .mp3 |

* If the audio file is poorly encoded, the MP3 will not work.

Prompt Queuing and Barge-in Behavior

Given a series of prompts to be queued and eventually played, any initial prompts that disallow barge-in are played without barge-in. Once a prompt that allows barge-in is queued, all subsequent prompts are queued with barge-in enabled, up to the point at which recognition triggers playback of the queue. Prompt queuing also extends back beyond a given field's own prompts: it includes all prompts in blocks declared immediately before the field.

Given a code snippet as follows:

<block>
  <prompt bargein="false">
    This is prompt one.
  </prompt>
  <prompt bargein="true">
    This is prompt two.
  </prompt>
</block>
<field type="digits">
  <prompt bargein="false">
    This is prompt three.
  </prompt>
  <prompt bargein="true">
    This is prompt four.
  </prompt>
</field>

“This is prompt one” would be queued and played immediately, and the user would not have the opportunity to barge in on the prompt. However, “This is prompt two”, “This is prompt three”, and “This is prompt four” would all be queued up together and played together, and the user would be able to barge in during any of the three prompts, including prompt three.
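The rule described above can be modeled in a few lines. This is only an illustrative sketch of the behavior as this page describes it, not the platform's implementation; the function name `effective_bargein` is made up for the example, and each list entry stands for one prompt's declared `bargein` value in queue order.

```python
def effective_bargein(declared: list[bool]) -> list[bool]:
    """Return whether each queued prompt can actually be interrupted.

    Leading prompts with bargein="false" stay uninterruptible. From the
    first prompt with bargein="true" onward, every prompt in the queue
    is treated as interruptible.
    """
    effective = []
    interruptible = False
    for flag in declared:
        interruptible = interruptible or flag
        effective.append(interruptible)
    return effective


# Prompts one to four from the snippet: false, true, false, true.
print(effective_bargein([False, True, False, True]))
# [False, True, True, True]
```

The result matches the walkthrough: only prompt one is protected, and prompt three becomes interruptible despite its own bargein="false" setting.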

https://en.wikipedia.org/wiki/MP3
https://en.wikipedia.org/wiki/WAV
http://www.stirlingaudioservices.com/gloss.htm
https://en.wikipedia.org/wiki/%CE%9C-law_algorithm
http://www.audacityteam.org/download/