CU VOCAL is a home-grown Cantonese speech synthesizer developed at the
Human-Computer Communications Laboratory of the Chinese University of
Hong Kong.
CU VOCAL generates highly natural and intelligible synthetic Cantonese speech
from input Chinese text, enabling dynamic information delivery via spoken
presentation. It adopts a syllable-based concatenative approach that considers
both coarticulatory and tonal contexts. A text processor has also been
developed for word segmentation, automatic disambiguation among multiple
pronunciations in Cantonese, and mixed-language (Chinese and English) handling.
CU VOCAL is available in an evaluation version and a commercial version.
The evaluation copy (version 1.2) is a three-month trial version that supports
a limited set of features.
The commercial copy (version 1.0) has no time limit and supports more features.
A consultancy can be set up for commercial parties to optimize speech
quality. Consultancy fees are to be negotiated. In both copies,
CU VOCAL includes a set of functions developed in MS Visual C++ 6.0 and is
delivered as a dynamic link library (DLL). Users can call these functions to
develop their own applications in MS Visual C++ (6.0 or above) or
MS Visual Basic (6.0 or above). CU VOCAL is also available as a
SAPI-compliant engine, which facilitates easy integration with other Windows
applications.
The following table summarizes the features of CU VOCAL.
| Feature | Evaluation License | Commercial License |
|---|---|---|
| Input text: Big5 codes | 10,419 characters, 195 symbols | 10,419 characters, 195 symbols |
| Input text: HKSCS 3 | Supported | Supported |
| Input text: English text | Spell out | Spell out |
| Voice | Female | Female |
| Fault tolerance | Skip problematic characters | Skip problematic characters |
| Output | Playback; WAV file (sampling rate: 8 / 16 kHz; sample size: 8 / 16 bit; format: PCM, MULAW, ALAW) | Playback; WAV file (sampling rate: 8 / 16 kHz; sample size: 8 / 16 bit; format: PCM, MULAW, ALAW) |
| Speed control | Supported at article level | Supported at character level |
| SSML | Not supported | Supported tags: <prosody>, <emphasis> |
| Text processing | Not configurable | Free evaluation license of CU Text Processing Resource (3 months) |
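
The commercial license lists `<prosody>` and `<emphasis>` as its supported SSML tags. As a rough illustration of what such input might look like, the sketch below builds a small SSML document with Python's standard library. It assumes standard W3C SSML 1.0 syntax. The `build_ssml` helper, the `zh-HK` language tag, the `rate` and `level` attribute values, and the sample text are all illustrative assumptions, and this page does not state how SSML is passed to CU VOCAL or which attributes it honors.

```python
import xml.etree.ElementTree as ET


def build_ssml(text, stressed, rate="slow"):
    """Wrap `text` in <prosody> and append `stressed` inside <emphasis>.

    Hypothetical helper: the attribute values follow the W3C SSML 1.0
    vocabulary and are assumptions, not documented CU VOCAL settings.
    """
    # Root element required by SSML 1.0; the language tag is an assumption.
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": "zh-HK"})
    # <prosody> controls speaking rate for everything it encloses.
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})
    prosody.text = text
    # <emphasis> marks the part of the sentence to be stressed.
    emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
    emphasis.text = stressed
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("今日天氣", "晴朗"))
```

Building the markup with an XML library rather than string concatenation keeps the document well-formed even if the input text contains characters such as `<` or `&`.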
Operating Systems
CU VOCAL runs on Microsoft Windows-based systems, including Windows 98, Windows NT, Windows 2000, and Windows XP.
Disk Requirements
CU VOCAL and all its associated linguistic data and voice libraries require approximately 100 MB of disk space. Extra disk space should be reserved for temporary WAV files and error log files created during run-time use.
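
To get a feel for how much extra space temporary WAV files may need, the sketch below estimates the audio payload size for the sampling rates and sample sizes listed in the feature table. It assumes mono output and uncompressed storage at the stated sample size (MULAW and ALAW use 8 bits per sample), and it ignores the small WAV header. The function name `wav_data_bytes` is hypothetical, and the channel count is an assumption because this page does not state it.

```python
def wav_data_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Return the audio payload size in bytes, excluding the WAV header.

    channels=1 (mono) is an assumption; the page does not state it.
    """
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


if __name__ == "__main__":
    # Tabulate one minute of speech for each supported combination.
    for rate in (8000, 16000):
        for bits in (8, 16):
            size = wav_data_bytes(rate, bits, 60)
            print(f"{rate} Hz, {bits}-bit: {size / 1_000_000:.2f} MB per minute")
```

Under these assumptions, one minute of speech takes about 0.48 MB at 8 kHz with 8-bit samples and about 1.92 MB at 16 kHz with 16-bit samples.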
Memory Requirements
CU VOCAL requires approximately 10 MB of physical RAM. We recommend at least 128 MB of memory for better performance and to facilitate invoking CU VOCAL from multiple threads.
Technical and sales support is available by e-mail or telephone.
E-mail: cuvocal@se.cuhk.edu.hk
Phone: (852) 3163-4073
For licensing issues, please send e-mail to cuvocal@se.cuhk.edu.hk.
Please specify technical support or licensing support in the e-mail header.
Remarks: As a university project group, we do not have a dedicated support team.
We will try to provide the above support with minimal turnaround time given
our limited resources.