AT&T Natural Voices
Voice Tag Attributes
<gender>
This attribute is supported.
<age>
This attribute is not supported.
<name>
If you have an onsite system, please contact your sales account manager to find out which of these voices are installed on your server.
Language | Name | Gender | US | UK |
American English (en_us) | Mel | male | x | - |
American English (en_us) | Mike | male | x | x |
American English (en_us) | Ray | male | x | x |
American English (en_us) | Rich | male | x | - |
American English (en_us) | Claire | female | x | - |
American English (en_us) | Crystal | female | x | x |
American English (en_us) | Julia | female | x | - |
American English (en_us) | Lauren | female | x | x |
Spanish (es_us) | Alberto | male | x | - |
Spanish (es_us) | Rosa | female | x | - |
British English (en_uk) | Charles | male | x | x |
British English (en_uk) | Anjali | female | x | - |
British English (en_uk) | Audrey | female | x | x |
French (fr_fr) | Alain | male | x | - |
French (fr_fr) | Juliette | female | x | - |
German (de_de) | Reiner | male | x | x |
German (de_de) | Claudia | female | x | x |
If no name is specified, the default voice is mike for the US AT&T Natural Voices and charles for the UK AT&T Natural Voices.
<speak>
The <speak> tag should be used to specify the desired language through the attribute xml:lang="lg-CN", where lg-CN is the language-country code (for example, en_us) listed in the Code column of the table of supported languages.
Please note that each voice has an associated language. Selecting a language that is not associated with the voice will result in unpredictable behavior; however, in many cases, you will hear the language the text was written in accented by that voice’s associated language.
<voice>
The <voice> tag should be used to specify the desired voice through the attribute name="name", where name is the voice listed in the Name column of the table of supported voices above.
If you want to use another voice within an app, specify it using the <speak> and <voice> tags within the prompt block.
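A minimal sketch of such a prompt block follows. The prompt text is illustrative, the lowercase spelling of the voice name is an assumption based on how the default voices are written in this section, and the surrounding VoiceXML document is omitted:

```xml
<prompt>
  <!-- en_us and crystal come from the voice table above; the spoken text is a placeholder. -->
  <speak xml:lang="en_us">
    <voice name="crystal">
      Hello, this prompt is read by Crystal.
    </voice>
  </speak>
</prompt>
```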
To sequentially use multiple languages and voices within a <prompt> block, use multiple <speak> and <voice> blocks.
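As a hedged illustration of switching language and voice inside one prompt, the sketch below pairs an American English voice with a French voice. The voice names and language codes come from the tables in this section; the spoken text and the lowercase voice names are assumptions:

```xml
<prompt>
  <!-- First block: American English, read by mike. -->
  <speak xml:lang="en_us">
    <voice name="mike">
      Welcome. The next greeting is in French.
    </voice>
  </speak>
  <!-- Second block: French, read by juliette. -->
  <speak xml:lang="fr_fr">
    <voice name="juliette">
      Bonjour et bienvenue.
    </voice>
  </speak>
</prompt>
```

Keeping each voice inside a speak block with its own associated language avoids the accented or unpredictable output described above for mismatched pairs.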
<xml:lang>
If you have an onsite system, please contact your sales account manager to find out which of these languages are installed on your server.
The following languages are supported by their respective engines:
Language | Code | US | UK |
German | de_de | X | X |
British English | en_uk | X | X |
American English | en_us | X | X |
Spanish | es_es | X | X |
French | fr_fr | X | X |
For example, use <voice xml:lang="en_us"> to hear an American speaker.
SSML Tags
An “x” marks that the Child Tag is supported by the speech engine. An asterisk (*) means that there are notes to explain the difference between the speech engines.
Child Tag | AT&T Natural Voices |
<break>* | x |
<emphasis> | |
<enumerate> | x |
<mark> | |
<paragraph>* | x |
<phoneme>* | x |
<prosody>* | x |
<say-as>* | x |
<sentence>* | x |
<speak> | x |
<sub> | x |
<value> | x |
<break>
This tag works as expected for the AT&T engine.
<paragraph>
This tag works as expected for the AT&T engine.
<phoneme>
This tag uses the phoneme set shown below.
Phoneme Set for AT&T Natural Voices:
Phoneme | Example | Transcription |
aa | Bob | b aa b 1 |
ae | bat | b ae t 1 |
ah | but | b ah t 1 |
ao | bought | b ao t 1 |
aw | down | d aw n 1 |
ax | about | ax 0 b aw t 1 |
ay | bite | b ay t 1 |
b | bet | b eh t 1 |
ch | church | ch er ch 1 |
d | dig | d ih g 1 |
dh | that | dh ae t 1 |
dx | butter | b ah 1 dx er 0 |
eh | bet | b eh t 1 |
em | Chatham | ch ae 1 dx em 0 |
en | satin | s ae 1 q en 0 |
er | bird | b er d 1 |
ey | bait | b ey t 1 |
f | fog | f ao g 1 |
g | got | g aa t 1 |
hh | hot | hh aa t 1 |
ih | bit | b ih t 1 |
iy | beat | b iy t 1 |
jh | jump | jh ah m p 1 |
k | cat | k ae t 1 |
l | lot | l aa t 1 |
m | Mom | m aa m 1 |
n | nod | n aa d 1 |
ng | sing | s ih ng 1 |
ow | boat | b ow t 1 |
oy | boy | b oy 1 |
p | pot | p aa t 1 |
q | button | b ah 1 q en 0 |
r | rat | r ae t 1 |
s | sit | s ih t 1 |
sh | shut | sh ah t 1 |
t | top | t aa p 1 |
th | thick | th ih k 1 |
uh | book | b uh k 1 |
uw | boot | b uw t 1 |
v | vat | v ae t 1 |
w | won | w ah n 1 |
y | you | y uw 1 |
z | zoo | z uw 1 |
zh | measure | m eh 1 zh er 0 |
Key
Symbol | Meaning |
0 | Unstressed |
1 | Primary stress |
2 | Secondary stress |
& | Word boundary |
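The sketch below shows how a transcription from the table might be supplied. It assumes the engine reads the transcription from the standard SSML ph attribute and that no alphabet attribute is needed to select this phoneme set; neither detail is confirmed by this section, so test it on your own system first:

```xml
<prompt>
  <!-- ph holds the transcription from the table; the enclosed text is the word it replaces. -->
  <phoneme ph="b ah 1 dx er 0">butter</phoneme>
</prompt>
```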
<prosody>
The prosody element works as expected for this engine.
You can specify a preset rate (“fast”, “medium”, “slow”, or “default”). However, using a preset rate is not recommended because the presets make the voice either too slow or too fast.
The “rate” attribute can also be set to a numeric value such as “100.0” or “50.0”. A normal voice rate is around “150.0”. These values do not follow the SSML spec, where rates are specified relative to 1.
You can also adjust the voice rate by using percentages: “+50%” makes the voice rate 50% faster and “-50%” makes it 50% slower. Note that the “pitch” attribute does not work for this engine.
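The rate settings described above can be sketched as follows. The attribute values are the ones named in this section; the spoken text is a placeholder:

```xml
<prompt>
  <!-- Numeric rate: around 150.0 is described above as a normal voice rate. -->
  <prosody rate="150.0">This sentence is read at a normal rate.</prosody>
  <!-- Percentage rates: +50% is faster, -50% is slower. -->
  <prosody rate="+50%">This sentence is read faster.</prosody>
  <prosody rate="-50%">This sentence is read slower.</prosody>
</prompt>
```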
<say-as>
The table below shows the <say-as> tag types supported by the AT&T engine.
Say-as Tag Types | AT&T Natural Voices |
acronym* | x |
address | x |
number | x |
number: cardinal | x |
number: ordinal | |
number: digits | |
number: decimal | x |
number: fraction | x |
number: telephone | x |
date | x |
date:dmy* | x |
date:mdy* | x |
date:ymd* | x |
date:ym* | |
date:my* | x |
date:md* | x |
date:dm* | x |
date:y* | x |
date:m | |
date:d | |
date:day | |
digits | |
duration | |
duration:h | |
duration:hm | |
duration:m | |
duration:ms | |
duration:s | |
measure* | x |
name | x |
net:email | x |
net:uri* | x |
time* | x |
time:h | x |
time:hm | x |
time:hms | x |
spell | |
telephone* | x |
currency* | x |
acronym: The acronym tag type works fine in the US, but does not work in the UK. If you are using AT&T Natural Voices and you want to spell out words or say back digits in the UK, you would have to use commas inside of a string such as “a, c, r, o, n, y, m” or “1, 2, 3, 4, 5”.
date:mdy: The preferred format of this tag is “month abbreviation day, year”. For example, to return “December 25, 2001”, you would type “Dec 25, 2001”. You can also use the “month/day/year” format such as “12/25/01” for the US, but this format will not work in the UK.
date:dmy: The preferred format of this tag is “day month abbreviation, year”. For example, to return “December 25, 2001”, you would type “25 Dec, 2001”.
date:ymd: The preferred format for this tag is “year month abbreviation day”. For example, to return “December 25, 2001”, you would type “2001, Dec 25”.
date:my: The format of this tag should be “month abbreviation, year”. For example, to return “December, 2001”, you would type “Dec, 2001”.
date:md: The preferred format for this tag is “month abbreviation day”. For example, to return “December 25”, you would type “Dec 25”. You can also use the “month/day” format such as “12/25” for the US, but this format will not work in the UK.
date:dm: The preferred format for this tag is “day month abbreviation”. For example, to return “December 25”, you would type “25 Dec”.
date:ym: The preferred format for this tag is “year/month”. For example, to return “December 2001”, you would type “2001/12”.
date:y: The date:y tag type works fine in the US, but does not work in the UK.
measure: For AT&T Natural Voices, the preferred format is, for example, 5'4".
net:uri: For AT&T Natural Voices, the preferred format is www.examplewebsite.com.
time: The time tag type works fine in the US, but does not work in the UK.
telephone: The telephone tag type works fine in the US, but does not work in the UK.
The format for telephone numbers is: 123-456-7890
The format for telephone extensions is: 123-456-7890 ext1234
NOTE: For extensions, AT&T Natural Voices says the number back correctly. In the example above, AT&T Natural Voices will say, “one two three four five six seven eight nine zero, extension one two three four.”
currency: When using the say-as type currency with a Spanish AT&T Natural Voices TTS voice, format the amount as $<dollar amount>,<cents amount>. The amount will not be pronounced correctly if you format it as $<dollar amount>.<cents amount>.
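A few of the formats from the notes above are sketched below. The type attribute is an assumption inferred from the type:format notation used in the table; the W3C SSML 1.0 Recommendation uses interpret-as and format attributes instead, so check which form your platform expects:

```xml
<prompt>
  <!-- date:mdy in the preferred "month abbreviation day, year" format. -->
  <say-as type="date:mdy">Dec 25, 2001</say-as>
  <!-- date:md in the preferred "month abbreviation day" format. -->
  <say-as type="date:md">Dec 25</say-as>
  <!-- Telephone number with an extension (US only, per the note above). -->
  <say-as type="telephone">123-456-7890 ext1234</say-as>
  <!-- UK workaround for acronym: plain text with commas between the letters. -->
  a, c, r, o, n, y, m
</prompt>
```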
<sentence>
This tag is supported.