AT&T Natural Voices

Voice Tag Attributes

<gender>

This attribute is supported.
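A minimal sketch of selecting a voice by gender instead of by name. The prompt text is only illustrative, and which installed voice the engine picks for a given gender is not stated here, so treat the result as an assumption to verify on your own system:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- Request any available female voice rather than naming one -->
    <voice gender="female">
     Thank you for calling.
    </voice>
   </prompt>
  </block>
 </form>
</vxml>
```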

<age>

This attribute is not supported.

<name>

If you have an onsite system, please contact your sales account manager to find out which of these voices are installed on your server.

Language                  Name      Gender  US  UK
American English (en_us)  Mel       male    x   -
American English (en_us)  Mike      male    x   x
American English (en_us)  Ray       male    x   x
American English (en_us)  Rich      male    x   -
American English (en_us)  Claire    female  x   -
American English (en_us)  Crystal   female  x   x
American English (en_us)  Julia     female  x   -
American English (en_us)  Lauren    female  x   x
Spanish (es_us)           Alberto   male    x   -
Spanish (es_us)           Rosa      female  x   -
British English (en_uk)   Charles   male    x   x
British English (en_uk)   Anjali    female  x   -
British English (en_uk)   Audrey    female  x   x
French (fr_fr)            Alain     male    x   -
French (fr_fr)            Juliette  female  x   -
German (de_de)            Reiner    male    x   x
German (de_de)            Claudia   female  x   x

If no name is specified, mike is the default voice for the US AT&T Natural Voices while charles is the default voice for the UK AT&T Natural Voices.

<speak>

The <speak> tag should be used to specify the desired language through the attribute xml:lang="lg-CN", where lg-CN is the language-country pair specified in the Language column from the table of supported languages.

Please note that each voice has an associated language. Selecting a language that is not associated with the voice will result in unpredictable behavior; however, in many cases, you will hear the language the text was written in accented by that voice’s associated language.

<voice>

The <voice> tag should be used to specify the desired voice through the attribute name="name", where name is the voice specified in the Name column of the table of supported voices above.

If you want to use another voice within an app, specify it using the <speak> and <voice> tags within the prompt block:

<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <speak xml:lang="en_us">
     <voice name="crystal">
      Hello, thank you for calling Plum Voice.
     </voice>
    </speak>
   </prompt>
  </block>
 </form>
</vxml>

To sequentially use multiple languages and voices within a <prompt> block, use multiple <speak> and <voice> blocks. For example:

<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <speak xml:lang="en_us">
     <voice name="crystal">
      Press one to continue in English.
     </voice>
    </speak>
    <speak xml:lang="es_us">
     <voice name="rosa">
      Presione dos para continuar en español.
     </voice>
    </speak>
    <speak xml:lang="fr_fr">
     <voice name="juliette">
      Appuyez sur trois pour continuer en français.
     </voice>
    </speak>
   </prompt>
  </block>
 </form>
</vxml>

<xml:lang>

If you have an onsite system, please contact your sales account manager to find out which of these languages are installed on your server.

The following languages are supported by their respective engines:

Language          Code   US  UK
German            de_de  x   x
British English   en_uk  x   x
American English  en_us  x   x
Spanish           es_es  x   x
French            fr_fr  x   x

For example, use <voice xml:lang="en_us"> to hear an American speaker.

SSML Tags

An “x” indicates that the child tag is supported by the speech engine. An asterisk (*) indicates that a note below explains how the tag behaves on this engine.

Child Tag     AT&T Natural Voices
<break>*      x
<emphasis>    -
<enumerate>   x
<mark>        -
<paragraph>*  x
<phoneme>*    x
<prosody>*    x
<say-as>*     x
<sentence>*   x
<speak>       x
<sub>         x
<value>       x

<break>

This tag works as expected for the AT&T engine.

<paragraph>

This tag works as expected for the AT&T engine.

<phoneme>

This tag uses the phoneme set shown below.

Phoneme Set for AT&T Natural Voices:

Phoneme  Example  Transcription
aa       Bob      b aa b 1
ae       bat      b ae t 1
ah       but      b ah t 1
ao       bought   b ao t 1
aw       down     d aw n 1
ax       about    ax 0 b aw t 1
ay       bite     b ay t 1
b        bet      b eh t 1
ch       church   ch er ch 1
d        dig      d ih g 1
dh       that     dh ae t 1
dx       butter   b ah 1 dx er 0
eh       bet      b eh t 1
em       Chatham  ch ae 1 dx em 0
en       satin    s ae 1 q en 0
er       bird     b er d 1
ey       bait     b ey t 1
f        fog      f ao g 1
g        got      g aa t 1
hh       hot      hh aa t 1
ih       bit      b ih t 1
iy       beat     b iy t 1
jh       jump     jh ah m p 1
k        cat      k ae t 1
l        lot      l aa t 1
m        Mom      m aa m 1
n        nod      n aa d 1
ng       sing     s ih ng 1
ow       boat     b ow t 1
oy       boy      b oy 1
p        pot      p aa t 1
q        button   b ah 1 q en 0
r        rat      r ae t 1
s        sit      s ih t 1
sh       shut     sh ah t 1
t        top      t aa p 1
th       thick    th ih k 1
uh       book     b uh k 1
uw       boot     b uw t 1
v        vat      v ae t 1
w        won      w ah n 1
y        you      y uw 1
z        zoo      z uw 1
zh       measure  m eh 1 zh er 0

Key

0  Unstressed
1  Primary stress
2  Secondary stress
&  Word boundary
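As a sketch of how a transcription from the phoneme set might be used, the prompt below supplies the table's transcription for “butter” through the standard SSML ph attribute. Whether this engine also requires an alphabet attribute, and what value it would take, is not stated here, so the example leaves it out as an assumption:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- ph holds the phoneme string; 1 and 0 are the stress marks from the key -->
    Please pass the <phoneme ph="b ah 1 dx er 0">butter</phoneme>.
   </prompt>
  </block>
 </form>
</vxml>
```

The text inside the element is kept as a readable fallback; the ph value is what the engine is expected to pronounce.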

<prosody>

The prosody element works as expected for this engine.

You can specify a preset rate (“fast”, “medium”, “slow”, or “default”). However, preset rates are not recommended because they make the voice either too slow or too fast.

The “rate” attribute can also be set to a numeric value such as “100.0” or “50.0”. A normal voice rate is around “150.0”. These values are not in accordance with the SSML spec, where rates are specified relative to 1.

You can also adjust the voice rate by using percentages. For example, “+50%” makes the voice rate 50% faster and “-50%” makes it 50% slower.

Note that the “pitch” attribute does not work for this engine.
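The rate settings described above can be sketched as follows. The sentences are placeholder text; the rate values are the ones mentioned in this section:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- Numeric rate: about 150.0 is a normal speaking rate on this engine -->
    <prosody rate="150.0">This sentence is read at a normal rate.</prosody>
    <!-- Relative rates expressed as percentages -->
    <prosody rate="+50%">This sentence is read fifty percent faster.</prosody>
    <prosody rate="-50%">This sentence is read fifty percent slower.</prosody>
   </prompt>
  </block>
 </form>
</vxml>
```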

<say-as>

The table below shows the <say-as> tag types supported by the AT&T engine.

Say-as Tag Types   AT&T Natural Voices
acronym*           x
address            x
number             x
number: cardinal   x
number: ordinal    -
number: digits     -
number: decimal    x
number: fraction   x
number: telephone  x
date               x
date:dmy*          x
date:mdy*          x
date:ymd*          x
date:ym*           -
date:my*           x
date:md*           x
date:dm*           x
date:y*            x
date:m             -
date:d             -
date:day           -
digits             -
duration           -
duration:h         -
duration:hm        -
duration:m         -
duration:ms        -
duration:s         -
measure*           x
name               x
net:email          x
net:uri*           x
time*              x
time:h             x
time:hm            x
time:hms           x
spell              -
telephone*         x
currency*          x

acronym: The acronym tag type works fine in the US, but does not work in the UK. If you are using AT&T Natural Voices and you want to spell out words or say back digits in the UK, you would have to use commas inside of a string such as “a, c, r, o, n, y, m” or “1, 2, 3, 4, 5”.
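A sketch of both approaches is shown below. It assumes the say-as type is passed through a type attribute, which this page does not spell out, and the acronym “IVR” is only an illustrative value:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- US: let the engine spell out the letters -->
    <say-as type="acronym">IVR</say-as>
    <!-- UK workaround: separate the letters or digits with commas -->
    I, V, R. 1, 2, 3, 4, 5.
   </prompt>
  </block>
 </form>
</vxml>
```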

date:mdy: The preferred format of this tag is “month abbreviation day, year”. For example, to return “December 25, 2001”, you would type “Dec 25, 2001”. You can also use the “month/day/year” format such as “12/25/01” for the US, but this format will not work in the UK.

date:dmy: The preferred format of this tag is “day month abbreviation, year”. For example, to return “December 25, 2001”, you would type “25 Dec, 2001”.

date:ymd: The preferred format for this tag is “year month abbreviation day”. For example, to return “December 25, 2001”, you would type “2001, Dec 25”.

date:my: The format of this tag should be “month abbreviation, year”. For example, to return “December, 2001”, you would type “Dec, 2001”.

date:md: The preferred format for this tag is “month abbreviation day”. For example, to return “December 25”, you would type “Dec 25”. You can also use the “month/day” format such as “12/25” for the US, but this format will not work in the UK.

date:dm: The preferred format for this tag is “day month abbreviation”. For example, to return “December 25”, you would type “25 Dec”.

date:ym: The preferred format for this tag is “year/month”. For example, to return “December 2001”, you would type “2001/12”.

date:y: The date:y tag type works fine in the US, but does not work in the UK.
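The preferred date formats from the notes above can be collected into one prompt. This sketch assumes the say-as type is passed through a type attribute, which this page does not spell out; the date strings are the ones from the notes:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- Each line uses the preferred input format for its date type -->
    <say-as type="date:mdy">Dec 25, 2001</say-as>
    <say-as type="date:dmy">25 Dec, 2001</say-as>
    <say-as type="date:ymd">2001, Dec 25</say-as>
    <say-as type="date:my">Dec, 2001</say-as>
    <say-as type="date:md">Dec 25</say-as>
    <say-as type="date:dm">25 Dec</say-as>
   </prompt>
  </block>
 </form>
</vxml>
```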

measure: For AT&T Natural Voices, the preferred format is, for example, 5'4".

net:uri: For AT&T Natural Voices, the preferred format is www.examplewebsite.com.

time: The time tag type works fine in the US, but does not work in the UK.

telephone: The telephone tag type works fine in the US, but does not work in the UK.

The format for telephone numbers is: 123-456-7890

The format for telephone extensions is: 123-456-7890 ext1234

NOTE: For extensions, AT&T Natural Voices says the number back correctly. In the example above, AT&T Natural Voices will say, “one two three four five six seven eight nine zero, extension one two three four.”
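A minimal sketch of both telephone formats, assuming the say-as type is passed through a type attribute (not confirmed on this page) and using the placeholder numbers from above:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- Plain ten-digit number -->
    <say-as type="telephone">123-456-7890</say-as>
    <!-- Same number with an extension appended as ext1234 -->
    <say-as type="telephone">123-456-7890 ext1234</say-as>
   </prompt>
  </block>
 </form>
</vxml>
```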

currency: When using the currency say-as type with a Spanish AT&T Natural Voices TTS voice, format the amount as $<dollar amount>,<cents amount>. The amount will not be pronounced correctly if you format it as $<dollar amount>.<cents amount>.
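For illustration, a Spanish prompt using the comma-separated format might look like the sketch below. The type attribute name, the lowercase voice name rosa, the es_us code (taken from the voice table), and the amount itself are assumptions made for this example:

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <speak xml:lang="es_us">
     <voice name="rosa">
      <!-- Comma, not period, between the dollar and cents amounts -->
      El total es <say-as type="currency">$12,50</say-as>.
     </voice>
    </speak>
   </prompt>
  </block>
 </form>
</vxml>
```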

<sentence>

This tag is supported.
