577 - Seminar in AI: Advanced Topics in Pattern Recognition
Probability Models for Information Processing and Machine Perception

Lectures: Thursdays 2:00-4:40 in CSB 632




Course Description

This seminar covers advanced probabilistic models and methods for pattern recognition and machine learning. The course focuses on models and algorithms but goes into the application-domain details required to understand state-of-the-art techniques. Application areas will be selected from computer vision and computational photography (~50%), text and natural language processing (~30%), and bioinformatics and computational biology (~20%). There will be two assignments covering fundamental material. The bulk of the course will consist of reading and discussing recent research papers and presenting a project.








Topics: My Lectures, Readings and Student Presentations


Jan. 24


Introduction & course overview


Jan. 31


Introduction to probability models

Mixture Models

The EM Algorithm

Introduction to Graphical Models


Required Reading:

Bilmes, J. A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models. Technical Report, UC Berkeley, ICSI-TR-97-021, 1997.

B. J. Frey and N. Jojic, Estimating mixture models of images and inferring spatial transformations using the EM algorithm, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 1999, pp. 416-422.


Student Presentation: Ross Messing - Transformation Invariant Mixture Models



Feb. 7


Inference and Estimation

Factor Graphs

Message Passing

MAP and MPE inference


Required Reading:

Kschischang, Frey and Loeliger (2001) Factor Graphs and the Sum-Product Algorithm. IEEE Transactions on Information Theory.


Additional Reading:

Jojic, Petrovic, Frey and Huang. Transformed Hidden Markov Models: Estimating Mixture Models of Images and Inferring Spatial Transformations in Video Sequences. IEEE CVPR 2000.


Student Presentation: Paul Ardis - Factor Graphs and the Sum-Product Algorithm



Feb. 14


Approximate Inference Techniques

Graph Cuts
Variational Methods




Kolmogorov and Zabih, What Energy Functions can be Minimized via Graph Cuts? ECCV '02/PAMI '04.


Additional Material:

* Boykov, Veksler and Zabih. Fast Approximate Energy Minimization via Graph Cuts. PAMI '01.

* Greig, Porteous and Seheult. "Exact Maximum A Posteriori Estimation for Binary Images." J. Royal Statistical Soc., Series B, vol. 51, no. 2, pp. 271-279, 1989.

* The Cornell Graph Cuts Page


Student Presentation: Satyaki Mahalanabis - Graph Cuts



Feb. 21


Bayesian Methods

Exponential Families

Variational Bayes



Blei, Ng and Jordan. Latent Dirichlet Allocation. JMLR 2003.


Student Presentation: Nick Morsillo - LDA

Student Presentation: Bin Wei - Learning Bayesian Networks
(Note: The second half of this presentation will be given on Feb. 28 because of the fire alarm.)



Feb. 28


More On Bayesian Methods

Markov Chain Monte Carlo

Hierarchical Bayesian Methods




Yee Whye Teh, Michael I Jordan, Matthew J. Beal and David M. Blei (2004) Sharing Clusters among Related Groups: Hierarchical Dirichlet Processes. NIPS (earlier version of the JASA paper).


Beal, Ghahramani and Rasmussen (2002) The Infinite Hidden Markov Model. NIPS 14.


Yee Whye Teh, Michael I Jordan, Matthew J. Beal and David M. Blei (2006) Hierarchical Dirichlet Processes, JASA.


Student Presentation: Ross Messing - Hierarchical Dirichlet Processes and Time Series Models.



March 6


Conditional Probability Models

Multinomial logistic regression and optimization techniques
Kernel MLR, Relevance Vector Machines
Conditional mixture models


Readings (Tentative):

Zhu and Hastie (2002) Kernel Logistic Regression and the Import Vector Machine. NIPS 14.


Student Presentation: Nick Morsillo - Kernel LR.



March 13


No class / Spring Break



March 20


Undirected Models I - MRFs and CRFs

History and Fundamentals

Their use in:

Computer Vision - Image Segmentation

Computational Photography - Focusing on CRFs for Stereo

Natural Language Processing - Information Extraction



Charles Sutton and Andrew McCallum (2006) An Introduction to Conditional Random Fields for Relational Learning. In Introduction to Statistical Relational Learning, edited by Lise Getoor and Ben Taskar. MIT Press.


Additional Reading:

Daniel Scharstein and Chris Pal (2007) Learning Conditional Random Fields for Stereo. In the proceedings of IEEE Computer Vision and Pattern Recognition (CVPR).


Student Presentation: Paul Ardis - CRFs



March 27


Special Class

Meeting with Guest, David Stork
Chief Scientist at Ricoh Innovations, consulting Professor at Stanford

Discussion meeting: 2:15-3:15, meeting in office of CS Dept. Chair.


ECE Talk 3:30 Goergen 101

When computers look at art: Image analysis in humanistic studies of visual arts.


Suggested Reading:

Duda, Hart and Stork. Pattern Classification (2nd Edition).



April 3


Undirected Models II - Boltzmann Machines
Discriminative and Generative Methods & Hybrid Techniques



Bouchard, G., Bias-Variance tradeoff in Hybrid Generative-Discriminative models, In J. Antoch, editor, Proceedings in Computational Statistics, 16th Symposium of IASC, volume 16, Prague. Physica-Verlag.


Student Presentation: Bin Wei


Additional Reading:

Druck, G., Pal, C., Zhu, X., and McCallum, A. (2007) Semi-Supervised Classification with Hybrid Generative/Discriminative Methods. In the proceedings of Knowledge Discovery and Data Mining (KDD).



April 10


Generalizing and Understanding Belief Propagation

Recent Views on Unifying Logic and Probability - Markov Logic



Yedidia, Freeman and Weiss. "Understanding Belief Propagation and its Generalizations" (Mitsubishi TR-2001-22).


Student Presentation (Tentative): Satyaki Mahalanabis - TBD


Additional Readings:

Matt Richardson and Pedro Domingos. Markov Logic Networks. Machine Learning, 62, 107-136, 2006.



April 17


Student project presentations:

Satyaki Mahalanabis

Nicholas Morsillo


Relevant Reading:

Morsillo et al., "Mining the Web for Visual Concepts." URCS Tech Report.


Martin Zinkevich, Online Convex Programming and Generalized Infinitesimal Gradient Ascent, ICML 2003.


With Some Possible Additional Discussion On:

More Markov Logic

Markov Decision Processes



April 24


Student project presentations:

Bin Wei

Paul Ardis


Tentative Additional Discussion on:

Bayesian Logic (BLOG)



Textbooks, References and Additional Reading


Pattern Recognition and Machine Learning, Chris Bishop, Springer 2006.

Information Theory, Inference and Learning Algorithms, David MacKay, Cambridge University Press 2003.

Pattern Classification (2nd Edition), Duda, Hart and Stork.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, T. Hastie, R. Tibshirani and J. H. Friedman.



Grading Scheme




Assignment #1 (Due March 20)


Assignment #2 (Due April 24)


Participation (Presenting papers)