********************************************************************


                                                     Seminar

             Department of Systems Engineering and Engineering Management
                                  The Chinese University of Hong Kong

------------------------------------------------------------------------------------------

 

 

 

Title   : Towards Accurate and Efficient Classification: A Discriminative and Frequent Pattern-based Approach

Speaker : Hong Cheng
          Department of Computer Science
          University of Illinois at Urbana-Champaign

Date    : February 25th, 2008 (Monday)

Time    : 11:30 a.m. - 12:30 p.m.

Venue   : Room 513
          William M.W. Mong Engineering Building
          (Engineering Building Complex Phase 2)
          CUHK

------------------------------------------------------------------------------------------

Abstract:
 

Classification is an essential theme widely studied in machine learning, statistics, and data mining. Many classification methods have been proposed in the literature, most of which assume that the input data is in a feature vector representation. However, in many applications, it is desirable to construct accurate classification models on complex structural data that have no initial feature vector representation, including transactions, sequences, graphs, semi-structured data, and texts. A primary question is how to construct a discriminative and compact feature set on which classification can be performed directly. A concrete example is classifying chemical compounds into various classes (e.g., toxic vs. nontoxic, active vs. inactive). Basic features such as atoms and links are too simple to preserve the structural information, while graph kernels make the classifiers hard to interpret.

My goal is to use discriminative frequent patterns to characterize complex structural data and thus enhance classification power. Theoretical analysis is provided to justify the discriminative power of frequent patterns. Two efficient search strategies have also been designed to mine the most discriminative patterns directly. Based on these results, I developed a framework for discriminative frequent pattern-based classification that can lead to a highly accurate, efficient, and interpretable classifier on complex data. The proposed pattern-based classification has proven useful in applications such as chemical compound classification, text categorization, and software engineering.


-------------------------------------------------------------------------------------------

Biography:
 

Hong Cheng is currently a Ph.D. candidate in the Department of Computer Science at the University of Illinois at Urbana-Champaign. She received her M.Phil. degree from the Hong Kong University of Science and Technology in 2003 and her B.S. degree from Zhejiang University in 2001, both in Computer Science. Her research interests include data mining, machine learning, and database systems. She has published over 20 research papers in international conferences, journals, and book chapters, including SIGKDD, SDM, VLDB, ICDE, ICDM, ACM Transactions on KDD, and Data Mining and Knowledge Discovery, and received research paper awards at ICDE 07, SIGKDD 06, and SIGKDD 05.


************************* ALL ARE WELCOME ************************

 

 

 

Host      : Prof. Chen Nan
Tel       : (852) 2609-8237
Email     : nchen@se.cuhk.edu.hk

 

 

 

Enquiries : Prof. Nan Chen or Prof. Sean X. Zhou
            Department of Systems Engineering and Engineering Management
            CUHK
Website   : http://www.se.cuhk.edu.hk/~seg5810
Email     : seg5810@se.cuhk.edu.hk

 

 

 

********************************************************************