Tutorials



APWeb 2013 features three tutorials, as shown below.

Tutorial 1

Speaker: Haixun Wang, Microsoft Research Asia

Title: Understanding Short Texts

Abstract: One of the biggest challenges in ad selection is that it is difficult to evaluate the semantic similarity between a search query and an ad. Clearly, traditional string similarity based on edit distance does not work. Moreover, statistical methods that find latent topic models from text also fall short, because ads and search queries are too short to provide enough statistical signal.
In this tutorial, I will talk about a knowledge-empowered approach to text understanding. When the input is sparse, noisy, and ambiguous, knowledge is needed to fill the gap in understanding. I will introduce the Probase project at Microsoft Research Asia, whose goal is to enable machines to understand human communications. Probase is a universal, probabilistic taxonomy more comprehensive than any current taxonomy. It contains more than 2 million concepts, harnessed automatically from a corpus of 1.68 billion web pages and two years' worth of search-log data. It enables probabilistic interpretations of search queries, document titles, ad keywords, etc. Its probabilistic nature also enables it to incorporate heterogeneous information naturally. I will explain how the core taxonomy, which contains hypernym-hyponym relationships, is constructed and how it models the uncertainty, ambiguity, and inconsistency inherent in knowledge.

Bio: Haixun Wang is a senior researcher at Microsoft Research Asia in Beijing, China, where he manages the Data Management, Analytics, and Services group. Before joining Microsoft, he was a research staff member at IBM T. J. Watson Research Center for 9 years. He was Technical Assistant to Stuart Feldman (Vice President of Computer Science of IBM Research) from 2006 to 2007, and Technical Assistant to Mark Wegman (Head of Computer Science of IBM Research) from 2007 to 2009. Haixun Wang has published more than 120 research papers in refereed international journals and conference proceedings. He is on the editorial boards of Distributed and Parallel Databases (DAPD), IEEE Transactions on Knowledge and Data Engineering (TKDE), Knowledge and Information Systems (KAIS), and Journal of Computer Science and Technology (JCST). He is PC co-Chair of WWW 2013 (P&E), ICDE 2013 (Industry), CIKM 2012, ICMLA 2011, and WAIM 2011. Haixun Wang received the ER 2008 Conference best paper award (DKE 25 year award) and the ICDM 2009 Best Student Paper runner-up award.

Tutorial 2



Speakers:

Title: Search on Graphs: Theory Meets Engineering

Abstract: The last decade has witnessed an explosion in the availability of, and interest in, graph-structured data. The desire to search and reason over these increasingly massive data collections pushes the boundaries of search languages, from pure keyword search to structure-aware searches in the graph. These phenomena have inspired a rich body of research on query languages, data management, and query evaluation techniques for graph data, from both the theoretical and engineering angles. In this tutorial, we present an overview of the progress on graph search queries, focusing specifically on how the theoretical and engineering perspectives meet and together advance the field.

Bio:

Prof. Wu is an Associate Professor at the School of Informatics and Computing, Indiana University, Bloomington, USA. Prof. Wu received her Ph.D. degree from the University of Michigan, Ann Arbor, in 2004. Her research area is data management, especially semi-structured and unstructured data, with an emphasis on query languages, query processing, and query optimization.

Dr. Fletcher is an Assistant Professor in the Databases and Hypermedia group at the Eindhoven University of Technology, The Netherlands. Dr. Fletcher was awarded a doctorate in computer science from Indiana University, Bloomington (2007), with a dissertation on the topic of query learning for data integration. His current research focuses on the study of database query languages for data integration and web data.


Tutorial 3

Speaker: Lei Chen, Hong Kong University of Science and Technology

Title: Managing the Wisdom of Crowds on Social Media Services

Abstract: Recently, the "Wisdom of Crowds" has attracted a huge amount of interest from both the research and industrial communities. So far, most of the focus has been on a few specific crowdsourcing marketplaces such as Amazon MTurk or CrowdFlower, on which "requesters" publish tasks and "workers" select tasks according to their own benefit. However, users on social media services can also serve as candidate "workers" for crowdsourcing tasks, and it is possible for the "requesters" to actively manage the quality and cost of such crowds.
In this tutorial, we will first review the basic concept of crowdsourcing and its applications. Then, we will discuss the currently popular crowdsourcing platforms and several interesting crowdsourcing-related algorithms. Finally, we will discuss the benefits of using social media services as a crowdsourcing platform and propose several research challenges in managing the wisdom of crowds on social media.

Bio: Lei Chen received the BS degree in Computer Science and Engineering from Tianjin University, Tianjin, China, in 1994, the MA degree from Asian Institute of Technology, Bangkok, Thailand, in 1997, and the PhD degree in computer science from the University of Waterloo, Waterloo, Ontario, Canada, in 2005. He is currently an Associate Professor in the Department of Computer Science and Engineering, Hong Kong University of Science and Technology. His research interests include crowdsourcing on social media, social media analysis, probabilistic and uncertain databases, and privacy-preserving data publishing. So far, he has published nearly 200 conference and journal papers. He received Best Paper Awards at DASFAA 2009 and 2010. He is a PC Track Chair for VLDB 2014, ICDE 2012, CIKM 2012, and SIGMM 2011. He has served as a PC member for SIGMOD, VLDB, ICDE, SIGMM, and WWW. Currently, he serves as an Associate Editor for IEEE Transactions on Knowledge and Data Engineering and Distributed and Parallel Databases. He is a member of the ACM and the chairman of the ACM Hong Kong Chapter.