New Features of CU VOCAL

Mar 2004: The CU VOCAL text-to-speech engine now supports SSML 1.0 (Speech Synthesis Markup Language)

- supports the tags in the “Prosody and Style” category:
  - <prosody> : controls the pitch, speaking rate and volume of the speech output
  - <emphasis> : requests that the contained text be spoken with emphasis
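
The sample sentences on this page are SSML fragments. A complete SSML 1.0 document also has a <speak> root element carrying a version, a namespace and a language. The following is a minimal sketch of wrapping a fragment and checking that the result is well-formed XML. The helper name `wrap_ssml` and the language tag `zh-HK` are assumptions made for illustration; the page does not say whether CU VOCAL expects a full document or which language tag it uses.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML 1.0 specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def wrap_ssml(fragment: str, lang: str = "zh-HK") -> str:
    """Wrap an SSML fragment in a <speak> root (hypothetical helper).

    The default language tag is an assumption, not something stated
    by CU VOCAL.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="{lang}">'
        f"{fragment}"
        "</speak>"
    )


doc = wrap_ssml("我宜家<emphasis level='strong'>好肚餓</emphasis>")

# ET.fromstring raises xml.etree.ElementTree.ParseError if a tag is
# unbalanced or an attribute is malformed.
root = ET.fromstring(doc)
print(root.tag)
```

Parsing before synthesis catches unbalanced tags early; it checks well-formedness only, not whether the engine supports a given attribute value.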

Example Sound Clips generated with CU VOCAL:

<prosody>

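attribute of <prosody> tag | sample sentences
pitch  | [play] 我是<prosody pitch='x-high'>合唱團的女高音</prosody> (English: "I am the choir's soprano")
       | [play] 我是<prosody pitch='x-low'>合唱團的女低音</prosody> (English: "I am the choir's alto")
rate   | [play] 我有緊要事要走先, 你地慢慢傾 (English: "I have something urgent and must leave first; you all take your time chatting")
       | [play] <prosody rate='fast'>我有緊要事要走先, 你地<prosody rate='x-slow'>慢慢傾</prosody></prosody>
volume | [play] 現在已經係夜深, 請將音量收細 (English: "It is already late at night; please turn the volume down")
       | [play] 現在已經係夜深, <prosody volume='x-soft'>請將音量收細</prosody>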
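
The rate sample nests one <prosody> element inside another. When such markup is generated by a program, building it as an element tree rather than by string concatenation keeps the tags balanced. The following is a minimal sketch using Python's standard library; it only produces the markup and makes no assumption about how the text is passed to CU VOCAL.

```python
import xml.etree.ElementTree as ET

# Outer element: fast speaking rate for the whole sentence.
outer = ET.Element("prosody", rate="fast")
outer.text = "我有緊要事要走先, 你地"

# Inner element: the last phrase is slowed down.
inner = ET.SubElement(outer, "prosody", rate="x-slow")
inner.text = "慢慢傾"

# encoding="unicode" returns a str and keeps the Chinese characters as-is.
ssml = ET.tostring(outer, encoding="unicode")
print(ssml)
```

The serializer writes attribute values in double quotes; this is equivalent XML to the single-quoted form used in the samples.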

<emphasis>

attribute of <emphasis> tag | sample sentences
level | [play] 我宜家好肚餓 (English: "I am very hungry right now")
      | [play] 我宜家<emphasis level='strong'>好肚餓</emphasis>

 

integrated example of the use of <prosody> and <emphasis>:

[play]

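事關4月差不多 所有 數碼相機都以500萬像素 推出新型號,所以 500萬像素以下機種 便要 大出血 其實 三百、四百萬像素 在日常生活已足夠應用買平機是時候

(English: "Because almost all digital cameras will launch new 5-megapixel models in April, models below 5 megapixels will have to be sold at deep discounts. In fact, three or four megapixels is enough for everyday use, so now is the time to buy an inexpensive camera.")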

corresponding SSML tags:

事關4月差不多 <emphasis level='strong'> 所有 </emphasis> 數碼相機都以 <prosody pitch='x-high'> 500萬像素 </prosody>推出新型號,所以<prosody rate='1.4'>500萬像素以下機種</prosody>便要<prosody pitch='x-low' rate='x-slow'>大出血</prosody>
其實 <prosody volume='loud'>三百、四百萬像素</prosody> 在日常生活已<prosody rate='slow'>足夠應用</prosody><emphasis level='strong'>買平機是時候</emphasis>
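
To check that marked-up text still says the same thing as the plain sentence, the tags can be stripped and the remaining text compared. The following is a minimal sketch applied to the first line of the markup above. The function names `spoken_text` and `tags_used` are hypothetical, and the bare <speak> wrapper is added only so the fragment parses as a single XML element. Whitespace is ignored because the spacing differs between the plain and the marked-up versions.

```python
import xml.etree.ElementTree as ET

marked_up = (
    "事關4月差不多 <emphasis level='strong'> 所有 </emphasis> 數碼相機都以 "
    "<prosody pitch='x-high'> 500萬像素 </prosody>推出新型號,所以"
    "<prosody rate='1.4'>500萬像素以下機種</prosody>便要"
    "<prosody pitch='x-low' rate='x-slow'>大出血</prosody>"
)


def _parse(fragment: str) -> ET.Element:
    # Wrap the fragment so it has a single root element.
    return ET.fromstring(f"<speak>{fragment}</speak>")


def spoken_text(fragment: str) -> str:
    """Return the text content with all tags and whitespace removed."""
    text = "".join(_parse(fragment).itertext())
    return "".join(text.split())


def tags_used(fragment: str) -> list:
    """List (tag, attributes) pairs in document order, excluding the wrapper."""
    root = _parse(fragment)
    return [(el.tag, dict(el.attrib)) for el in root.iter() if el is not root]


print(spoken_text(marked_up))
for tag, attrs in tags_used(marked_up):
    print(tag, attrs)
```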

* Equivalent features are supported in the SAPI version of CU VOCAL through SAPI XML tags.