RealSpeak Engine

Voice Tag Attributes

<gender>

The gender attribute should not be used if the name attribute is already being used for the <voice> tag.

<age>

This attribute is not supported.

<name>

If you have an onsite system, please contact your sales account manager to find out which of these voices are installed on your server.

The following names are supported by their respective engines:

| Language | Name | Gender | US | UK |
| --- | --- | --- | --- | --- |
| American English (en-US) | Tom | male | x | - |
| American English (en-US) | Jennifer | female | x | - |
| American English (en-US) | Jill | female | x | - |
| American English (en-US) | Samantha | female | x | - |
| Mexican Spanish (es-MX) | Javier | male | x | - |
| Mexican Spanish (es-MX) | Paulina | female | x | - |
| British English (en-GB) | Daniel | male | x | x |
| British English (en-GB) | Emily | female | x | x |
| Australian English (en-AU) | Lee | male | x | - |
| Australian English (en-AU) | Karen | female | x | - |
| Canadian French (fr-CA) | Felix | male | x | - |
| Canadian French (fr-CA) | Julie | female | x | - |
| Portuguese (pt-PT) | Madalena | female | x | - |
| Brazilian Portuguese (pt-BR) | Raquel | female | x | - |
| German (de-DE) | Yannick | male | - | x |
| German (de-DE) | Steffi | female | x | x |
| Spanish (es-ES) | Diego | male | - | x |
| Spanish (es-ES) | Isabel | female | - | x |
| French (fr-FR) | Sebastien | male | - | x |
| French (fr-FR) | Virginie | female | - | x |
| Italian (it-IT) | Silvia | female | x | x |
| Dutch (nl-NL) | Claire | female | x | x |
| Belgian Dutch (nl-BE) | Ellen | female | - | x |
| Mandarin Chinese (zh-CN) | Mei-Ling | female | x | - |

If no name is specified, Jill is the default voice for the US RealSpeak Engine, while Emily is the default voice for the UK RealSpeak Engine.

Please contact your account manager if you want any of the following RealSpeak voices:

| Language | Name | Gender |
| --- | --- | --- |
| Danish (da-DK) | Nanna | female |
| Italian (it-IT) | Paolo | male |
| Indian English (en-IN) | Sangeeta | female |
| Spanish (es-ES) | Monica | female |
| Basque (eu-ES) | Arantxa | female |
| Japanese (ja-JP) | Kyoko | female |
| Korean (ko-KR) | Narae | female |
| Norwegian (no-NO) | Nora | female |
| Polish (pl-PL) | Agata | female |
| Russian (ru-RU) | Katerina | female |
| Swedish (sv-SE) | Ingrid | female |
| Hong Kong Cantonese (zh-HK) | Sin-ji | female |

The voice name MUST be used together with its corresponding, case-sensitive xml:lang language code if the language is not en-US (American English). For example, to hear the Mexican Spanish voice “Javier”, use the following:

<speak xml:lang="es-MX"><voice name="Javier">
¿A ti te gusta el queso fresco?
</voice></speak>

NOTE: For US speech recognition, we currently only offer American English speech recognition, Spanish speech recognition, and French-Canadian speech recognition for Plum DEV. If you are interested in any other speech recognition languages, please contact your sales representative.

NOTE: For UK speech recognition, we currently only offer American English speech recognition and British English speech recognition for Plum DEV. If you are interested in any other speech recognition languages, please contact your sales representative.

<speak>

The <speak> tag should be used to specify the desired language through the attribute xml:lang="lg-CN", where lg-CN is the language-country pair specified in the Language column of the table of supported languages.

Please note that each voice has an associated language. Selecting a language that is not associated with the voice will result in unpredictable behavior; however, in many cases, you will hear the language the text was written in accented by that voice’s associated language.

<voice>

The <voice> tag should be used to specify the desired voice through the attribute name="name", where name is the voice specified in the Name column of the table of supported voices.

If another voice is desired, it should be specified using the <speak> and <voice> tags within the prompt block, as follows:

<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <speak xml:lang="es-MX">
    <voice name="Mia" variant="1">
     Hello, thank you for calling Plum Voice.
    </voice>
    </speak>
   </prompt>
  </block>
 </form>
</vxml>

To sequentially use multiple languages and voices within a <prompt> block, use multiple <speak> and <voice> blocks. For example:

<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <speak xml:lang="en-US">
    <voice name="Joanna" variant="2">
     Press one to continue in English.
    </voice>
    </speak>
    <speak xml:lang="es-US">
    <voice name="Lupe" variant="2">
     Presione dos para continuar en español.
    </voice>
    </speak> 
    <speak xml:lang="fr-FR">
    <voice name="Celine" variant="standard">
     Appuyez sur trois pour continuer en français.
    </voice>
    </speak>
   </prompt>
  </block>
 </form>
</vxml>

<xml:lang>

If you have an onsite system, please contact your sales account manager to find out which of these languages are installed on your server.

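| Language | Code Value | US | UK |
| --- | --- | --- | --- |
| American English | en-US | x | - |
| Mexican Spanish | es-MX | x | - |
| Canadian French | fr-CA | x | - |
| German | de-DE | x | x |
| British English | en-GB | x | x |
| French | fr-FR | - | x |
| Spanish | es-ES | - | x |
| Belgian Dutch | nl-BE | - | x |
| Dutch | nl-NL | x | x |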

Please contact your account manager if you want any of the following RealSpeak languages:

| Language | Code Value |
| --- | --- |
| Danish | da-DK |
| Swiss German | de-CH |
| Australian English | en-AU |
| Indian English | en-IN |
| Basque | eu-ES |
| Belgian French | fr-BE |
| Swiss French | fr-CH |
| Swiss Italian | it-CH |
| Italian | it-IT |
| Japanese | ja-JP |
| Korean | ko-KR |
| Norwegian | no-NO |
| Polish | pl-PL |
| Brazilian Portuguese | pt-BR |
| Portuguese | pt-PT |
| Russian | ru-RU |
| Swedish | sv-SE |
| Mandarin Chinese | zh-CN |
| Hong Kong Cantonese | zh-HK |

Note that the RealSpeak Engine uses a different syntax for the xml:lang attribute. For example, to hear a French speaker you need to use the case-sensitive country code: <voice xml:lang="fr-FR">.

SSML Tags

An “x” marks that the Child Tag is supported by the speech engine. An asterisk (*) means that there are notes to explain the difference between the speech engines.

| Child Tag | RealSpeak Engine |
| --- | --- |
| <break>* | x |
| <emphasis> | |
| <enumerate> | x |
| <mark> | |
| <paragraph>* | x |
| <phoneme>* | x |
| <prosody>* | x |
| <say-as>* | x |
| <sentence>* | x |
| <speak> | x |
| <sub> | x |
| <value> | x |

<break>

This works as expected.

<paragraph>

This works as expected.

<phoneme>

Phoneme Set for Nuance RealSpeak:

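| Phoneme | Example | Transcription |
| --- | --- | --- |
| i | feel | ' f i l |
| I | fill | ' f I l |
| E | fell | ' f E l |
| @ | cat | ' k @ t |
| A | got | ' g A t |
| ^ | cut | ' k ^ t |
| O | fall | ' f O l |
| U | full | ' f U l |
| u | fool | ' f u l |
| $ | allow | $ . ' l a+U |
| E0 | curt | ' k E0 R= t |
| O | door | ' d O R= |
| e+I | fail | ' f e+I l |
| O+I | foil | ' f O+I l |
| a+I | file | ' f a+I l |
| a+U | foul | ' f a+U l |
| o+U | goal | ' g o+U l |
| j | yes | ' j E s |
| w | why | ' w a+I |
| R= | rip | ' R= I p |
| l | lip | ' l I p |
| p | pit | ' p I t |
| t | top | ' t A p |
| k | cat | ' k @ t |
| b | bit | ' b I t |
| d | dig | ' d I g |
| g | got | ' g A t |
| ? | eat | ' ? i t |
| f | fat | ' f @ t |
| T | thin | ' T i n |
| s | seal | ' s i l |
| S | ship | ' S i p |
| v | vat | ' v @ t |
| D | then | ' D e n |
| z | zeal | ' z i l |
| Z | leisure | ' l iZ $ R= |
| h | hat | ' h @ t |
| t+S | catch | ' k @ t+S |
| d+Z | journey | ' d+Z E0 R= . n i |
| m | man | ' m @ n |
| n | nut | ' n ^ t |
| nK | ring | ' R= I nK |
| r6 | butter | ' b ^ . r6 $ R= |
| i0 | sanity | s @ . n i0 . t i |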

Key

| Symbol | Meaning | Example | Transcription |
| --- | --- | --- | --- |
| _ | word delimiter | nut butter | ' n ^ t _ ' b ^ . r6 $ R= |
| ' | primary word stress | record (verb) | R= I . ' k O R= d |
| '2 | secondary word stress | explanation | ' 2 E k . s p l $ . ' n e+I . S $ n |
| " | sentence accent | There are TWO ACCENTS in this sentence | D E R= _ A R= _ " t u _ " @ k . s E n t s _ ? I n _ D I s _ ' s E n . t $ n s |
| . | syllable boundary | syllable | ' s I . l $ . b $ l |
| # | silence (pause) | I said: don't do it | ? a+I " s E d # d o+U n t " d u _ I t |
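As a rough illustration of how a transcription from the phoneme set might be supplied, the sketch below wraps the word "butter" in a <phoneme> tag and passes its transcription in the ph attribute, as in standard SSML. The alphabet value "x-realspeak" is a placeholder assumption, not a confirmed identifier for this engine, and keeping the spaces between symbols in ph is likewise an assumption; check with your account manager for the values your system expects.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <speak xml:lang="en-US">
     <!-- alphabet="x-realspeak" is a hypothetical placeholder value -->
     <!-- ph holds the transcription of "butter" from the phoneme table -->
     Please pass the
     <phoneme alphabet="x-realspeak" ph="' b ^ . r6 $ R=">butter</phoneme>.
    </speak>
   </prompt>
  </block>
 </form>
</vxml>
```

The text inside the tag serves as a readable fallback for anyone maintaining the script; the engine is expected to speak the transcription instead.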

<prosody>

When using a RealSpeak TTS voice, the speaking rate does not revert to normal after the <prosody> tag closes.

To restore it, you must use the <prosody> tag again with the “volume” attribute set to “100.0” and the “rate” attribute set to “default”.

Note that this engine does not support the “pitch” attribute.

You cannot specify the “rate” value as an integer with this engine, but percentages and the preset rates (“fast”, “medium”, “slow”, or “default”) work as expected.
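The reset described above could look like the following minimal sketch. The prompt wording is invented for illustration; the attribute values ("slow", "100.0", "default") are the ones named in this section.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- Slow down one sentence using a preset rate -->
    <prosody rate="slow">
     Your confirmation number is one two three four.
    </prosody>
    <!-- The rate does not reset when the tag closes, so restore it explicitly -->
    <prosody volume="100.0" rate="default">
     Thank you for calling.
    </prosody>
   </prompt>
  </block>
 </form>
</vxml>
```

Wrapping the remainder of the prompt in the second <prosody> tag keeps the reset close to the text it applies to.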

<say-as>

The table below shows the <say-as> tag types supported by this engine.

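| Say-as Tag Types | RealSpeak Engine |
| --- | --- |
| acronym* | |
| address | x |
| number | x |
| number: cardinal | x |
| number: ordinal | x |
| number: digits | x |
| number: decimal | x |
| number: fraction | x |
| number: telephone | x |
| date | x |
| date:dmy* | x |
| date:mdy* | x |
| date:ymd* | x |
| date:ym* | x |
| date:my* | x |
| date:md* | x |
| date:dm* | x |
| date:y* | x |
| date:m | x |
| date:d | x |
| date:day | x |
| digits | x |
| duration | x |
| duration:h | x |
| duration:hm | x |
| duration:m | x |
| duration:ms | x |
| duration:s | x |
| measure* | x |
| name | x |
| net:email | x |
| net:uri* | |
| time* | x |
| time:h | x |
| time:hm | x |
| time:hms | x |
| spell | x |
| telephone* | x |
| currency* | x |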

acronym: The acronym tag type works fine in the US, but does not work in the UK.

date:mdy: The preferred format of this tag is “month abbreviation day, year”. For example, to return “December 25, 2001”, you would type “Dec 25, 2001”. You can also use the “month/day/year” format such as “12/25/01” for the US, but this format will not work in the UK.

date:dmy: The preferred format of this tag is “day month abbreviation, year”. For example, to return “December 25, 2001”, you would type “25 Dec, 2001”.

date:ymd: The preferred format for this tag is “year month abbreviation day”. For example, to return “December 25, 2001”, you would type “2001, Dec 25”.

date:my: The format of this tag should be “month abbreviation, year”. For example, to return “December, 2001”, you would type “Dec, 2001”.

date:md: The preferred format for this tag is “month abbreviation day”. For example, to return “December 25”, you would type “Dec 25”. You can also use the “month/day” format such as “12/25” for the US, but this format will not work in the UK.

date:dm: The preferred format for this tag is “day month abbreviation”. For example, to return “December 25”, you would type “25 Dec”.

date:ym: The preferred format for this tag is “year/month”. For example, to return “December 2001”, you would type “2001/12”.

date:y: The date:y tag type works fine in the US, but does not work in the UK.
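The date formats above can be sketched in a prompt as follows. This assumes the say-as tag type is passed through a type attribute (for example type="date:mdy"); the attribute name is an assumption based on the "say-as type" wording in this section, so confirm it against your platform's <say-as> reference before relying on it.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    <!-- Preferred date:mdy format: month abbreviation day, year -->
    <say-as type="date:mdy">Dec 25, 2001</say-as>
    <!-- Preferred date:dmy format: day month abbreviation, year -->
    <say-as type="date:dmy">25 Dec, 2001</say-as>
    <!-- Preferred date:ym format: year/month -->
    <say-as type="date:ym">2001/12</say-as>
   </prompt>
  </block>
 </form>
</vxml>
```

Using the month-abbreviation formats rather than the slash formats keeps the same markup usable on both the US and UK engines, given the notes above.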

measure: For RealSpeak, either format, 5'4" or 5m, will work.

net:uri: The RealSpeak Engine does not read back URLs correctly and will play a web address as 'www point examplewebsite point com'.

time: The time tag type works fine in the US, but does not work in the UK.

telephone: The telephone tag type works fine in the US, but does not work in the UK.

The format for telephone numbers is: 123-456-7890

The format for telephone extensions is: 123-456-7890 ext1234

NOTE: For extensions, RealSpeak will say the number back correctly. In the example above, RealSpeak will say, “one two three four five six seven eight nine zero, extension one two three four.”
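A minimal sketch of the telephone format in a prompt is shown below. As with the other say-as examples, passing the tag type through a type attribute is an assumption; the number and extension are the sample values from this section.

```xml
<?xml version="1.0"?>
<vxml version="2.0">
 <form>
  <block>
   <prompt>
    To reach support, call
    <!-- Number and extension written in the documented format -->
    <say-as type="telephone">123-456-7890 ext1234</say-as>
   </prompt>
  </block>
 </form>
</vxml>
```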

currency: When using the currency say-as type for AT&T Natural Voices with a Spanish TTS voice, you need to format the amount as $<dollar amount>,<cents amount>. The amount will not be pronounced correctly if you format it as $<dollar amount>.<cents amount>.

<sentence>

This functions as expected.
