SEEM Seminar 19 Oct

******************************************************************************

                                    Seminar

       Department of Systems Engineering and Engineering Management,

                    The Chinese University of Hong Kong

------------------------------------------------------------------------------

Title:
A Study on the Verification of Large Vocabulary Continuous Speech Recognition
(LVCSR) Output using Generalized Posterior Probabilities

Speaker:
Dr. Wai-kit Lo
Systems Engineering & Engineering Management Department
The Chinese University of Hong Kong

Date : October 19, 2005 (Wednesday)

Time : 5:30 p.m. - 6:30 p.m.

Venue : Room 513, William M.W. Mong Engineering Building
        (Engineering Building Complex Phase 2), CUHK

Abstract:
Speech is a natural and efficient means of communication for humans. Automatic
speech recognition (ASR) is an essential component in spoken language
interaction between humans and computers. Among the many kinds of ASR systems,
a large vocabulary continuous speech recognition (LVCSR) system converts
spoken input into a string of words. However, the current state-of-the-art
speech recognition technology is still not robust to all kinds of variability
in speech signals, e.g., speakers, channels, and noisy environments. In many
applications (e.g., spoken dialogue interfaces and automatic speech
translation), hypothesized word strings with a few erroneous words are still
useful.

In order to pinpoint the correct words from LVCSR output, reliability is
assessed for recognized words to reweight their importance, or to facilitate a
hard acceptance/rejection decision, before further processing. In this talk,
we will focus on a statistical measure of recognition confidence, the
Generalized Posterior Probability (GPP). GPP is a modified version of the
posterior probability of recognition output, given the acoustic observation
and LVCSR. GPP can be computed directly and efficiently by applying the
forward-backward algorithm on the word graph (the search space for recognition
output), which is a by-product of the decoding process in an LVCSR system.
Verification using GPP is efficient in the sense that training of additional
models (e.g., anti-models) is not needed.
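
As a rough illustration of the computation described above, the following
Python sketch runs the forward-backward algorithm over a toy word graph and
returns a posterior for each word edge. It is not the speaker's
implementation. The function name edge_posteriors, the edge tuple layout, and
the exponents alpha and beta (weights applied to the acoustic and language
model log scores, assumed here as one possible form of the "generalized"
modification) are all hypothetical choices made for this example. Any further
modifications that GPP may apply are omitted.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def edge_posteriors(edges, start, end, alpha=1.0, beta=1.0):
    """Posterior of each edge in a word graph via forward-backward.

    edges: list of (src, dst, word, log_acoustic, log_lm) tuples.
           Node ids are integers in topological order (src < dst).
    alpha, beta: assumed weights on the acoustic and language model
           log scores (hypothetical parameters for this sketch).
    """
    # Weighted log score of every edge.
    score = [alpha * a + beta * l for (_, _, _, a, l) in edges]

    nodes = {n for e in edges for n in e[:2]}
    fwd = {n: -math.inf for n in nodes}
    bwd = {n: -math.inf for n in nodes}
    fwd[start] = 0.0
    bwd[end] = 0.0

    # Forward pass: visit edges by increasing source node.
    for i in sorted(range(len(edges)), key=lambda i: edges[i][0]):
        src, dst = edges[i][0], edges[i][1]
        fwd[dst] = logsumexp([fwd[dst], fwd[src] + score[i]])

    # Backward pass: visit edges by decreasing destination node.
    for i in sorted(range(len(edges)), key=lambda i: -edges[i][1]):
        src, dst = edges[i][0], edges[i][1]
        bwd[src] = logsumexp([bwd[src], score[i] + bwd[dst]])

    total = fwd[end]  # log of the summed score over all complete paths
    return [
        math.exp(fwd[e[0]] + score[i] + bwd[e[1]] - total)
        for i, e in enumerate(edges)
    ]


# Toy word graph: two competing first words, one shared second word.
toy_edges = [
    (0, 1, "speech", math.log(0.6), 0.0),
    (0, 1, "beach", math.log(0.4), 0.0),
    (1, 2, "recognition", 0.0, 0.0),
]
toy_posteriors = edge_posteriors(toy_edges, start=0, end=2)
```

A verification step could then accept a word when its posterior exceeds a
threshold and reject it otherwise; the threshold value would have to be tuned
on held-out data.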

We will show how to compute the GPP from a word graph and also demonstrate the
robustness of optimized GPP for verification of LVCSR output. GPP verification
is applied to speech input under various conditions, e.g., speech in noise
and hands-free applications. Our experimental results show that
the use of GPP for verification of recognition output provides consistent
improvement over the baseline in minimizing verification errors. Finally, the
talk will conclude by extending the GPP approach to recognition output
at different linguistic levels, such as subword (e.g., character in Chinese),
word, and utterance.

Bio:

Wai-kit Lo received his B.Eng. (Hons.), M.Phil., and Ph.D. degrees, all from
the Chinese University of Hong Kong, Shatin, Hong Kong, in 1994, 1996, and
2002, respectively.

He was a Project Coordinator at the Department of Electronic Engineering, the
Chinese University of Hong Kong from 1997 to 2002. He has been working on
research and development projects covering various topics in spoken
language processing, including spoken language corpora, speech synthesis,
speech recognition and spoken document retrieval. In the summer of 2000, he
participated in the Summer Research Workshop at Johns Hopkins University and
engaged in the translingual speech retrieval project – MEI: Mandarin-English
Information. From 2002 to 2003, he was a Project Engineer at the Department of
Systems Engineering and Engineering Management, the Chinese University of Hong
Kong, and worked on the Author Once Present Anywhere (AOPA) project, an
application of speech technologies to facilitate universal accessibility for
Internet technologies. From 2003 to 2005, he was a researcher at the Spoken
Language Communication Research Laboratories of the Advanced Telecommunications
Research Institute (ATR) in Kyoto, Japan. During his stay at ATR, he worked
on problems related to statistical confidence measures for rejection of
incorrect outputs in speech recognition systems and their application to speech
translation systems. In 2005, he rejoined the Department of Systems Engineering
and Engineering Management as a Research Assistant Professor.

Dr. Lo has been a member of the Institute of Electrical and Electronics
Engineers (IEEE) since 1992 and the International Speech Communication
Association (ISCA) since 1997.

______________________________________________________________________________

***** ALL ARE WELCOME *****

Host : Prof. Helen Meng
Tel : 26098327
Email : hmmeng@se.cuhk.edu.hk

For more information, please refer to http://www.se.cuhk.edu.hk/~seg5810/