CU VOCAL Features and Licensing Information

 

CU VOCAL is a home-grown Cantonese speech synthesizer developed at the Human-Computer Communications Laboratory of the Chinese University of Hong Kong.  CU VOCAL generates highly natural and intelligible synthetic Cantonese speech from input Chinese text, which enables dynamic information delivery via spoken presentation.  CU VOCAL adopts a syllable-based concatenative approach that considers both coarticulatory and tonal contexts.  A text processor has also been developed for word segmentation, automatic disambiguation among multiple pronunciations in Cantonese, and handling of mixed-language (Chinese and English) text.
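
Because the engine takes Chinese text as input and the feature table lists Big5 and HKSCS as the supported character sets, an application may want to pre-check its text before passing it on. The following is a minimal sketch, not part of CU VOCAL: the function name `split_unencodable` is hypothetical, and Python's standard `big5hkscs` codec is only assumed to approximate the engine's actual character coverage, which may differ.

```python
def split_unencodable(text, encoding="big5hkscs"):
    """Split text into characters the codec can encode and those it cannot.

    Assumption: Python's codec is used as a stand-in for the engine's
    character coverage; the two sets are not guaranteed to match.
    """
    kept, skipped = [], []
    for ch in text:
        try:
            ch.encode(encoding)
            kept.append(ch)
        except UnicodeEncodeError:
            skipped.append(ch)
    return "".join(kept), skipped


if __name__ == "__main__":
    kept, skipped = split_unencodable("香港\U0001F600abc")
    print("kept:", kept)
    print("skipped:", skipped)
```

Reporting the skipped characters to the caller, rather than dropping them silently, makes it easier to see why part of an input would not be spoken.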

 

Licensing Information:


CU VOCAL is available in an evaluation version and a commercial version.  The evaluation copy (version 1.2) is a three-month trial version that supports a limited set of features.  The commercial copy (version 1.0) has no time limit and supports more features.  A consultancy can be set up for commercial parties in order to optimize speech quality; consultancy fees are to be negotiated.  Both copies include a set of functions developed in MS Visual C++ 6.0 and delivered as a dynamic link library (DLL).  Users can call these functions to develop their own applications using MS Visual C++ (6.0 or above) or MS Visual Basic (6.0 or above).  CU VOCAL is also available as a SAPI-compliant engine, which facilitates integration with other Windows applications.


Supported Features:

 

The following table summarizes the features of CU VOCAL.

 

Feature             Evaluation License                    Commercial License
------------------  ------------------------------------  ------------------------------------
Input Text
  Big5 codes        10,419 characters, 195 symbols        10,419 characters, 195 symbols
  HKSCS 3           Supported                             Supported
  English Text      Spell out                             Spell out
Voice               Female                                Female
Fault Tolerance     Skip problematic characters           Skip problematic characters
Output              Playback; WAV file                    Playback; WAV file
  Sampling rate     8 / 16 kHz                            8 / 16 kHz
  Sample size       8 / 16 bit                            8 / 16 bit
  Format            PCM, MULAW, ALAW                      PCM, MULAW, ALAW
Speed Control       Supported at article level (same      Supported at character level
                    speed over the whole input article)
SSML                Not supported                         Supported tags: <prosody>, <emphasis>
Text Processing     Not configurable                      Free evaluation license of CU Text
                                                          Processing Resource (3 months)
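
To illustrate the two SSML tags listed for the commercial license, the following sketch builds a small SSML document with Python's standard `xml.etree.ElementTree`. It is only an illustration: the helper name `build_ssml` is hypothetical, the `rate` and `level` attributes and the `version` / `xml:lang` attributes on `<speak>` come from the W3C SSML 1.0 specification, and which attributes and values CU VOCAL actually accepts is an assumption that should be checked against the product documentation.

```python
import xml.etree.ElementTree as ET


def build_ssml(before, emphasized, after, rate="slow"):
    """Wrap text in <prosody> and mark one span with <emphasis>.

    Assumption: attribute names and values follow W3C SSML 1.0;
    CU VOCAL's support for them is not confirmed by the source.
    """
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": "zh-HK"})
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})
    prosody.text = before
    emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
    emphasis.text = emphasized
    emphasis.tail = after  # text that follows the emphasized span
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("今日天氣", "非常", "好"))
```

Building the markup with an XML library rather than string concatenation means that characters such as `<` or `&` in the input text are escaped automatically.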

 


System Requirements

 

Operating Systems

CU VOCAL runs on Microsoft Windows-based systems, including Windows 98, Windows NT, Windows 2000 and Windows XP.

 

Disk Requirements

CU VOCAL and all its associated linguistic data and voice libraries require approximately 100 MB of disk space.  Extra disk space should be reserved for temporary WAV files and error log files created during run-time use. 
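
As a rough guide to how much extra space temporary WAV files may need, the size of the audio data can be estimated from the sampling rate and sample size listed in the feature table. The sketch below is an estimate only: the function name `wav_data_bytes` is hypothetical, mono output is an assumption (the source does not state the channel count), and the file header adds a small fixed overhead that is ignored here.

```python
def wav_data_bytes(seconds, sample_rate_hz, bits_per_sample, channels=1):
    """Estimate the audio payload size of an uncompressed WAV file.

    Assumption: mono output by default; header overhead is not included.
    """
    bytes_per_sample = bits_per_sample // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


if __name__ == "__main__":
    # One minute of speech at the supported rate / size combinations.
    for rate in (8000, 16000):
        for bits in (8, 16):
            size = wav_data_bytes(60, rate, bits)
            print(f"{rate} Hz, {bits} bit: {size / 1e6:.2f} MB per minute")
```

Under these assumptions, one minute of 16 kHz, 16-bit audio takes about 1.92 MB, and one minute of 8 kHz, 8-bit audio (which also covers the MULAW and ALAW formats at 8 bits per sample) takes about 0.48 MB.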

 

Memory Requirements

CU VOCAL requires approximately 10 MB of physical RAM.  We recommend at least 128 MB of memory for better performance and to facilitate invoking CU VOCAL from multiple threads.

 

Technical and Sales Support

 

Technical and sales support is available by e-mail or by telephone.
E-mail: cuvocal@se.cuhk.edu.hk

Phone: (852) 3163-4073

For licensing issues, please send e-mail to cuvocal@se.cuhk.edu.hk.
Please specify whether you need technical support or licensing support in the e-mail header.

Remarks: As a university project group, we do not have a dedicated support team.
We will try to provide the above support with minimal turnaround time given our limited resources.
