Tutorials are available at no extra cost for conference participants.

The Theory and Application of Latent Topic Models

Sunday, December 2, 09:00 - 11:15

Timothy J. Hazen
MIT Lincoln Laboratory

Latent topic modeling refers to a class of techniques in which hidden (or latent) semantic concepts (or topics) can be discovered in an unsupervised fashion from a collection of documents. Latent topic modeling techniques can provide an effective means for improving a wide variety of text and speech applications including document clustering, document link detection, query-by-example document retrieval, document summarization, corpus summarization, and automatic speech recognition.

This tutorial will focus on the most prominent latent topic modeling approaches: latent semantic analysis (LSA), probabilistic latent semantic analysis (PLSA), and latent Dirichlet allocation (LDA). The basic theory behind each approach will be presented first, followed by an overview of the engineering techniques required to perform the unsupervised learning of latent topics. In particular, the standard EM, variational EM, and Gibbs sampling algorithms for unsupervised training will all be reviewed.
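Of the three approaches, LSA is the simplest to illustrate: it factorizes a term-document matrix with a truncated singular value decomposition so that documents sharing related vocabulary land near each other in a low-dimensional latent space. The sketch below is a minimal toy example (the matrix, term labels, and choice of two latent dimensions are illustrative assumptions, not material from the tutorial):

```python
import numpy as np

# Hypothetical toy term-document count matrix (rows: terms, columns: documents).
X = np.array([
    [2, 1, 0, 0],   # "topic"
    [1, 2, 0, 0],   # "model"
    [0, 0, 2, 1],   # "speech"
    [0, 0, 1, 2],   # "signal"
], dtype=float)

# LSA: truncated SVD of the term-document matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                                    # number of latent dimensions (a free parameter)
doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T   # documents represented in the latent space

def cos(a, b):
    """Cosine similarity between two latent document vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents 0 and 1 share vocabulary, documents 0 and 2 do not.
print(cos(doc_vecs[0], doc_vecs[1]))
print(cos(doc_vecs[0], doc_vecs[2]))
```

PLSA and LDA replace this linear-algebraic factorization with probabilistic generative models, which is where the EM, variational EM, and Gibbs sampling machinery reviewed in the tutorial comes in.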

The tutorial will review a variety of applications of topic modeling, including document similarity assessment, document retrieval, document summarization, corpus summarization and automatic speech recognition. Methods for improving topic models through the incorporation of multi-word phrases or n-gram dependency structures will be discussed. The tutorial will conclude by highlighting extensions of the traditional latent topic models, such as the author-topic model and the community-topic model.

Statistical Dialogue Management for Conversational Spoken Interfaces: How, Why and Scaling-up

Sunday, December 2, 13:00 - 15:15

Paul A. Crook

Recent commercial systems, such as Siri, have significantly increased public awareness of spoken dialogue systems (SDS). However, such commercial systems tend to sidestep the problem of maintaining a conversation, instead adopting a one-shot question-answer, command-and-control or voice-search approach.

Even in relatively simple tasks, such as flight booking or providing tourist information, the ability to maintain a conversation has significant advantages. Ambiguities (either due to noisy recognition or confusable semantics) can be resolved through judicious selection of speech acts by the SDS Dialogue Manager (SDS DM).

Statistical models provide a particularly attractive approach to building conversational systems. The different types of uncertainty that arise, e.g. uncertainty associated with recognition or with identifying the context, goals or conversation state, can be directly modelled. Information arising from different turns in the dialogue can be integrated and the conversation's state updated in a principled manner. There is also no need to design cumbersome sub-dialogues to recover from dialogue mistakes. A statistical model naturally integrates any negation, rejection or correction into its update of the conversation state.
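The turn-by-turn integration described here is, at its core, a Bayesian belief update: the posterior over the user's goal is proportional to the prior times the likelihood of the noisy observation. The sketch below is a minimal illustration under stated assumptions — a single hypothetical slot with three goal values and a crude uniform-confusion recognizer model; real SDS belief trackers are far richer:

```python
# A minimal sketch of a statistical dialogue-state (belief) update.
# The slot values and recognizer error model are illustrative assumptions.

GOALS = ["cheap", "moderate", "expensive"]

def update_belief(belief, observation, p_correct=0.7):
    """One Bayesian turn update: posterior ∝ likelihood × prior."""
    p_error = (1.0 - p_correct) / (len(GOALS) - 1)
    posterior = {}
    for goal in GOALS:
        likelihood = p_correct if goal == observation else p_error
        posterior[goal] = likelihood * belief[goal]
    z = sum(posterior.values())          # normalise to a distribution
    return {g: p / z for g, p in posterior.items()}

# Uniform prior; two noisy observations of "cheap" sharpen the belief.
# A later correction (e.g. observing "expensive") would be integrated
# by exactly the same update rule — no special recovery sub-dialogue.
belief = {g: 1.0 / len(GOALS) for g in GOALS}
belief = update_belief(belief, "cheap")
belief = update_belief(belief, "cheap")
print(max(belief, key=belief.get))
```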

Recent research has demonstrated the expected improvements in robustness when statistical models, coupled with techniques for automatic policy optimisation, are used for dialogue management. However, scaling up to tackle real-world problems is still an area of active research.

This tutorial will provide an introduction to statistical dialogue management and outline the advantages that statistical models provide. It will look at current statistical models found in SDS DM research, e.g. Bayesian networks and POMDPs, and illustrate how such systems are created and their policies trained using both POMDP planning and the latest sample-efficient reinforcement learning techniques. It will finish with a discussion of current work on scaling up such models to handle sufficient contextual information that real-world problems can be tackled while retaining robustness (e.g. Mixture-Model POMDPs, summary spaces, automatic belief compression and large-scale POMDP solvers).

Statistical Language Modeling Turns Thirty-Something: Are We Ready To Settle Down?

Sunday, December 2, 15:45 - 18:00

Sanjeev P. Khudanpur
Associate Professor,
The Johns Hopkins University

It has been more than a decade since Roni Rosenfeld described "Two Decades of Statistical Language Modeling: Where Do We Go From Here?" in August 2000, in a special issue of the Proceedings of the IEEE on spoken language processing. Perhaps it is time to review what we have learnt in the years since?

This tutorial will begin with what was well known in 2000 --- n-grams, decision tree language models, syntactic language models, maximum entropy (log-linear) models, latent semantic analysis and dynamic adaptation --- and then move on to discuss new techniques that have emerged since, such as models with sparse priors, nonparametric Bayesian methods (including Dirichlet processes), and models based on neural networks, including feed-forward, recurrent and deep belief networks.
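The n-gram model, the oldest technique on this list, already exhibits the core estimation problem the tutorial dwells on: most word sequences never occur in training data, so the raw counts must be smoothed. The toy sketch below shows a bigram model with add-one (Laplace) smoothing, one of the simplest schemes; the corpus is a made-up illustration, and production systems of the era used stronger methods such as Kneser-Ney:

```python
from collections import Counter

# Hypothetical toy corpus; real models are trained on millions of words.
corpus = "the cat sat on the mat . the dog sat on the log .".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)  # vocabulary size

def p_bigram(w_prev, w):
    """P(w | w_prev) with add-one smoothing over the vocabulary."""
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + V)

# A seen continuation outscores an unseen one, but the unseen one
# still gets nonzero probability thanks to smoothing.
print(p_bigram("sat", "on"), p_bigram("sat", "dog"))
```

The later techniques the tutorial covers — sparse priors, Dirichlet processes, and neural network language models — can be read as increasingly principled answers to this same sparse-data problem.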

In addition to surveying current techniques, a major goal of the tutorial will be to expose the core mathematical/statistical issues in language modeling, and to explain how various competing methods address these issues. It will be argued that the key to solving what appears at first blush to be a hopelessly high-dimensional, sparse-data estimation problem is to structure the model (family) and to guide the choice of parameter values using linguistic knowledge. It is hoped that viewing the core issues in this manner will enable the audience to gain a deeper understanding of the strengths and weaknesses of various approaches.

And, no, we are not yet ready to settle down; at least not all of us. But we now know what we are looking for. It varies from application to application: to each his own!