CU VOCAL Features and Licensing Information

CU VOCAL is a home-grown Cantonese speech synthesizer developed at the Human-Computer Communications Laboratory of The Chinese University of Hong Kong. CU VOCAL generates highly natural and intelligible synthetic Cantonese speech from input Chinese text, enabling dynamic information delivery through spoken presentation. It adopts a syllable-based concatenative approach that considers both coarticulatory and tonal contexts. A text processor has also been developed for word segmentation, automatic disambiguation among multiple pronunciations in Cantonese, and mixed-language (Chinese and English) handling.

Licensing Information:

CU VOCAL is available in an evaluation version and a commercial version. The evaluation copy (version 1.2) is a three-month trial that supports a limited set of features. The commercial copy (version 1.0) has no time limit and supports more features. A consultancy can be set up for commercial parties to optimize speech quality; consultancy fees are to be negotiated. In both copies, CU VOCAL includes a set of functions developed in MS Visual C++ 6.0 and is delivered as a dynamic link library (DLL). Users can call these functions to develop their own applications using MS Visual C++ (6.0 or above) or MS Visual Basic (6.0 or above). CU VOCAL is also available as a SAPI-compliant engine, which facilitates easy integration with other Windows applications.

Supported Features:

The following table summarizes the features of CU VOCAL.

Feature	Evaluation License	Commercial License

Input Text
Big5 codes	10,419 characters, 195 symbols	10,419 characters, 195 symbols
HKSCS ³	Supported	Supported
English Text	Spell out	Spell out

Voice	Female	Female

Fault Tolerance	Skip problematic characters	Skip problematic characters

Output	Playback WAV file; sampling rate: 8 / 16 kHz; sample size: 8 / 16 bit; format: PCM, MULAW, ALAW	Playback WAV file; sampling rate: 8 / 16 kHz; sample size: 8 / 16 bit; format: PCM, MULAW, ALAW

Speed Control	Supported at article level (same speed over the whole input article)	Supported at character level

SSML	Not supported	Supported tags: <prosody>, <emphasis>

Text Processing	Not configurable	Free 3-month evaluation license for the CU Text Processing Resource
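
As an illustration of the SSML row above, the following minimal Python sketch builds a fragment that uses only the two tags listed for the commercial license, <prosody> and <emphasis>. The helper name build_ssml, the sample text, and the attribute values (rate="slow", level="strong") are assumptions taken from the W3C SSML 1.0 vocabulary; this page does not state which attribute values CU VOCAL accepts or how SSML input is passed to the engine. A fully conforming SSML document would also carry a namespace and xml:lang on the root element, omitted here for brevity.

```python
import xml.etree.ElementTree as ET


def build_ssml(plain: str, slow: str, stressed: str) -> str:
    """Return an SSML string that uses only <prosody> and <emphasis>.

    Hypothetical helper for illustration; not part of the CU VOCAL API.
    """
    speak = ET.Element("speak", {"version": "1.0"})
    speak.text = plain  # text spoken with default settings

    # Assumed attribute value: "slow" is a rate label defined by SSML 1.0.
    prosody = ET.SubElement(speak, "prosody", {"rate": "slow"})
    prosody.text = slow

    # Assumed attribute value: "strong" is an emphasis level in SSML 1.0.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = stressed

    # encoding="unicode" returns a str and keeps Chinese characters as-is.
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("歡迎使用", "粵語語音合成", "系統")
print(ssml)
```

Building the markup with an XML library rather than string concatenation keeps the tags balanced and escapes characters such as & and < in the input text.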

System Requirements

Operating Systems

CU VOCAL runs on Microsoft Windows systems, including Windows 98, Windows NT, Windows 2000, and Windows XP.

Disk Requirements

CU VOCAL and all its associated linguistic data and voice libraries require approximately 100 MB of disk space. Extra disk space should be reserved for temporary WAV files and error log files created during run-time use.
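
To gauge how much extra space temporary WAV files may need, the size of an uncompressed file can be estimated from the output settings in the feature table. The Python sketch below is a rough estimate under stated assumptions: mono audio (the channel count is not given on this page) and the canonical 44-byte PCM WAV header (MULAW and ALAW files typically carry a slightly larger header). The function name estimate_wav_bytes is hypothetical.

```python
def estimate_wav_bytes(seconds: float, sample_rate_hz: int,
                       bits_per_sample: int, channels: int = 1) -> int:
    """Estimate the size in bytes of an uncompressed WAV file.

    Assumptions: mono output by default and a 44-byte canonical PCM header.
    """
    header_bytes = 44
    frames = int(seconds * sample_rate_hz)            # samples per channel
    bytes_per_frame = (bits_per_sample // 8) * channels
    return header_bytes + frames * bytes_per_frame


# One minute of speech at the lowest and highest settings in the table.
for rate, bits in [(8000, 8), (16000, 16)]:
    size = estimate_wav_bytes(60, rate, bits)
    print(f"{rate} Hz, {bits}-bit: {size / 1e6:.2f} MB per minute")
```

Under these assumptions, one minute of 16 kHz, 16-bit speech takes roughly 1.9 MB, so the space to reserve scales with how many temporary files are kept at once.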

Memory Requirements

CU VOCAL requires approximately 10 MB of physical RAM. We recommend at least 128 MB of memory for better performance and to support invoking CU VOCAL from multiple threads.

Technical and Sales Support

Technical and sales support is available by e-mail or telephone.
E-mail: cuvocal@se.cuhk.edu.hk

Phone: (852) 3163-4073

For licensing issues, please send an e-mail to cuvocal@se.cuhk.edu.hk
Please specify "technical support" or "licensing support" in the e-mail subject line.

Remarks: As a university project group, we do not have a dedicated support team.
We will try to provide the above support with minimal turnaround time, given our limited resources.