4th International Conference on Integrating GIS and Environmental Modeling (GIS/EM4):
Problems, Prospects and Research Needs. Banff, Alberta, Canada, September 2 - 8, 2000.

Image Interpretation and Segmentation with Hierarchical Probabilistic Models


Chris Pal, David Swayne, Brendan Frey


Raster-based digital aerial imagery must be classified into relevant areas of interest for many environmental tasks. For example, forest resource management, ecosystem health monitoring and urban growth analysis all rely on such classification information. Here we focus on extracting features relevant to landscape ecology. We show how graphical representations of probability models can aid the construction of statistically principled classifiers. The formalism of the graphical probability model facilitates the construction of complex models by making the relationships between model sub-components explicit. In our approach we construct a large hierarchical Markov Random Field model and use it to classify and segment images into relevant features defined at different scales of resolution simultaneously.


Image Classification, Landscape Ecology, Aerial Image Analysis/Interpretation, Image Segmentation, Feature Extraction, Clustering, Markov Random Fields, Graphical Probability Models

Introduction and Problem Statement

Segmenting raster-based digital aerial images into relevant areas of interest greatly aids subsequent ecological analysis of the underlying area. However, hand-classifying large raster images is extremely time-consuming. Also, depending on the scale of resolution and the goals of the analysis, different features may be of interest (e.g. a deciduous forest vs. a single oak tree). Thus, in some cases we wish to obtain different conceptual classifications for different resolutions of the same image. Additionally, some features of interest are difficult to specify without spatial contextual analysis (e.g. "a small grass field enclosed by trees").

Figure 1. Aerial imagery and an associated coarse scale hand classification.

Figure 1 illustrates the type of classification that was of interest for our application at a relatively coarse resolution level. Our goal in this work was to construct a classifier that allows ecologically relevant features in an image to be classified at different scales of resolution simultaneously. We also wished to exploit prior knowledge about the likely spatial arrangements of landscape elements at each resolution level.


Our model addresses these goals by allowing the notions of conceptual context and spatial context to be formalized and specified using a Graphical Probability Model (GPM). In our approach a hierarchically structured discrete-state Markov Random Field (MRF) is combined with Gaussian Mixture Model (GMM) classifiers constructed for individual pixel level observations and for sliding windows of 16x16 pixels. Illustrating the MRF as a probability graph emphasizes the probabilistic interpretation of the model. In the past, specialized stochastic techniques such as Gibbs sampling combined with simulated annealing (Geman and Geman, 1984) have commonly been used for updating MRFs constructed for image restoration. In our approach, however, we use an MRF for image classification and segmentation, and we use the probabilistic message passing procedure described by Frey (1998) to accomplish what can be thought of as contextual inference. This message passing procedure can be seen as an efficient evaluation of the large summations in the equation form of the classifiers, given as Equations (1) and (2) in Figure 2. Further details can be found in (Frey, 1998).
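
As a rough illustration of the window level observations, the sketch below cuts a grayscale image into 16x16 windows. The stride of 8 pixels is an assumption (chosen here only to match a down-sampling factor of eight), and the function name `sliding_windows` and the synthetic image are hypothetical; none of this is taken from the original implementation.

```python
import numpy as np


def sliding_windows(image, size=16, stride=8):
    """Cut a 2-D image into overlapping size x size windows.

    Returns an array of shape (rows, cols, size, size), where entry [r, c]
    is the window whose top-left corner is at pixel (r * stride, c * stride).
    The stride is an assumed value, not one stated in the text.
    """
    height, width = image.shape
    rows = (height - size) // stride + 1
    cols = (width - size) // stride + 1
    windows = np.empty((rows, cols, size, size), dtype=image.dtype)
    for r in range(rows):
        for c in range(cols):
            y, x = r * stride, c * stride
            windows[r, c] = image[y:y + size, x:x + size]
    return windows


# Synthetic 64x64 "image" used only to exercise the function.
image = np.arange(64 * 64, dtype=float).reshape(64, 64)
windows = sliding_windows(image)
```

Each window would then be passed through a change of basis and a window level classifier, while the individual pixels feed the pixel level classifier.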

Figure 2. (a) Left: Composing a GMM from clustered data. (b) Right: The fusion of a GMM with a discrete state MRF. Centre: The corresponding equations for a GMM and a MRF.

First, an image was labeled with respect to pixel level conceptual classes, and a classifier was constructed for these classes. The original image was then down-sampled by a factor of eight, and a portion of this coarser resolution image was labeled using different conceptual classes that more appropriately described the concepts associated with 16x16 pixel windows. The Expectation Maximization (EM) algorithm (Dempster et al., 1977) was then used to fit a Gaussian Mixture Model for each conceptual class at each of the two resolution levels. The number of mixture components was selected automatically by introducing a minimum description length (MDL) penalty term into the likelihood calculation, as described in (Bouman, 1995). For each resolution level, the mixture models for the separate classes were combined using Bayes' rule into a single classifier for that level. The mixture model is illustrated graphically in Figure 2(a), and Equation (1) gives the corresponding computation.

Once these classifiers were constructed, they were composed into a large image scale MRF model for each resolution level. Likely spatial relationships between the highest-level classification variables were specified using frequency counts from the two conceptually labeled images. Figure 2(b) illustrates how the GMM was fused with an MRF, while Figure 3(a) illustrates the structure of the model over a group of observations (which could be either pixel level or window level). Equation (2) represents the fundamental computation used to update the GMM-MRF, where v represents a variable in the graph (circles), Q is the set of all functions or probability tables, and Vq represents the variables involved in function q (rectangles). In this way, two initially separate lattices with conceptually different classes were constructed for the two resolution levels.
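
The two ingredients described above, per-class mixture models combined through Bayes' rule and a neighbour compatibility table built from frequency counts, can be sketched as follows. This is a minimal illustration and not the original implementation: the mixture parameters are assumed to have been fitted already (for example by EM), diagonal covariances are assumed for brevity, and all function names, the smoothing constant and the toy numbers are hypothetical.

```python
import numpy as np


def gmm_log_density(x, weights, means, variances):
    """Log density of a diagonal-covariance Gaussian mixture at each row of x.

    x: (n, d), weights: (k,), means: (k, d), variances: (k, d).
    """
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]                       # (n, k, d)
    log_comp = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                              # (n, k)
    log_comp = log_comp + np.log(weights)
    peak = log_comp.max(axis=1, keepdims=True)                     # log-sum-exp
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=1, keepdims=True)))[:, 0]


def class_posterior(x, class_models, priors):
    """Bayes' rule: P(class | x) is proportional to p(x | class) * P(class)."""
    log_joint = np.stack(
        [gmm_log_density(x, *model) for model in class_models], axis=1
    ) + np.log(priors)
    log_joint -= log_joint.max(axis=1, keepdims=True)
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=1, keepdims=True)


def neighbour_table(label_image, n_classes, smoothing=1.0):
    """Symmetric table of label co-occurrence frequencies for 4-connected
    neighbours in a hand-labelled image (additive smoothing is an assumption)."""
    counts = np.full((n_classes, n_classes), smoothing)
    pairs = (
        (label_image[:, :-1], label_image[:, 1:]),   # horizontal neighbours
        (label_image[:-1, :], label_image[1:, :]),   # vertical neighbours
    )
    for a, b in pairs:
        np.add.at(counts, (a.ravel(), b.ravel()), 1)
        np.add.at(counts, (b.ravel(), a.ravel()), 1)
    return counts / counts.sum()


# Toy example: two classes over a one-dimensional feature.
class_models = [
    (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]])),
    (np.array([0.5, 0.5]), np.array([[4.0], [6.0]]), np.array([[1.0], [1.0]])),
]
priors = np.array([0.5, 0.5])
x = np.array([[0.0], [5.0]])
posterior = class_posterior(x, class_models, priors)
predicted = posterior.argmax(axis=1)

label_image = np.array([[0, 0, 1], [0, 1, 1]])
table = neighbour_table(label_image, n_classes=2)
```

The posterior rows play the role of the local evidence at each site, and the table plays the role of the pairwise probability table linking neighbouring classification variables.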

Figure 3. (a) Left: A section of the larger model for a single resolution level. (b) Centre: The coupling of the two image level models. (c) Right: "Mother Wavelets" used for initial basis.

To classify a new image, classifications at the coarser scale are first generated using only the GMM. Probabilistic message passing is then used to "update" this initial classification based on the likely spatial configurations, as illustrated by the arrows in Figure 3(a). Conceptual contextual information from the coarser scale analysis is then incorporated into the pixel level MRF using probabilistic messages from the MRF at the coarser resolution level, as illustrated in Figure 3(b). The pixel level MRF then uses message passing to take into account likely pixel level spatial relationships. Figure 3(c) shows the 4th order anisotropic Daubechies mother wavelets that were used for an initial change of basis for the 16x16 pixel observation windows.
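
The lateral message passing step within a single resolution level can be sketched with a generic synchronous sum-product update on a 4-connected lattice. This is an illustrative stand-in, not the specific procedure of (Frey, 1998): the function names, the compatibility table and the toy evidence values are all hypothetical, and the local evidence is assumed to be strictly positive.

```python
import numpy as np

# Offset (di, dj) of the neighbour each message arrives FROM: up, down, left, right.
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
OPPOSITE = [1, 0, 3, 2]  # index of the reverse direction


def shift(a, di, dj, fill=1.0):
    """Return b with b[i, j] = a[i + di, j + dj]; off-lattice entries get `fill`."""
    out = np.full_like(a, fill)
    h, w = a.shape[:2]
    src_i, dst_i = slice(max(di, 0), h + min(di, 0)), slice(max(-di, 0), h + min(-di, 0))
    src_j, dst_j = slice(max(dj, 0), w + min(dj, 0)), slice(max(-dj, 0), w + min(-dj, 0))
    out[dst_i, dst_j] = a[src_i, src_j]
    return out


def message_passing(unary, pairwise, n_iters=3):
    """Synchronous sum-product updates on a 4-connected lattice.

    unary: (h, w, k) strictly positive local evidence (e.g. GMM class posteriors).
    pairwise: (k, k) symmetric compatibility table between neighbouring labels.
    Returns normalised beliefs of shape (h, w, k).
    """
    h, w, k = unary.shape
    msgs = [np.ones((h, w, k)) for _ in OFFSETS]
    for _ in range(n_iters):
        new_msgs = []
        for d, (di, dj) in enumerate(OFFSETS):
            # Each sender combines its evidence with every incoming message
            # except the one that came from the intended recipient.
            pre = unary.copy()
            for e in range(len(OFFSETS)):
                if e != OPPOSITE[d]:
                    pre *= msgs[e]
            out = pre @ pairwise                       # sum over the sender's label
            out /= out.sum(axis=2, keepdims=True)      # normalise for stability
            new_msgs.append(shift(out, di, dj))        # deliver to the recipient
        msgs = new_msgs
    belief = unary * np.prod(msgs, axis=0)
    return belief / belief.sum(axis=2, keepdims=True)


# Toy example: every site favours class 0 except an isolated centre site.
unary = np.tile(np.array([0.8, 0.2]), (5, 5, 1))
unary[2, 2] = [0.4, 0.6]
pairwise = np.array([[0.9, 0.1], [0.1, 0.9]])
belief = message_passing(unary, pairwise, n_iters=3)
```

In this toy case the isolated centre site, which the local evidence alone would assign to class 1, is pulled to class 0 by its neighbours, which is the kind of contextual correction that the spatial model is intended to provide.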

Results and Conclusion

We applied this procedure to the task of automating the interpretation and segmentation of relatively high resolution (one meter per pixel or better) orthographic aerial imagery of a rural area. Results are illustrated in Figure 4. For simplicity of illustration we selected the pixel level classes to be the same as the window level classes; in practice the classes would differ. Note also that the imagery was grayscale, so the pixel level classification was particularly ambiguous. For realistic applications, pixel level classification would only be done for multi-spectral imagery. Additionally, some ecological applications may require more highly specialized classifiers. However, our approach can provide a good initial "guess" to seed such specialized and potentially more computationally expensive classifiers.

Figure 4. Left to right: (a) The GMM window level classification. (b) The MRF with GMM substructure classifier after three iterations of lateral probabilistic message passing. (c) The GMM pixel level classification. (d) A classification using the complete model.

References used

Bouman, C. A. 1995. CLUSTER: An unsupervised algorithm for modeling Gaussian mixtures. Technical Report. Purdue University. (Online, June 2000) http://www.ece.purdue.edu/~bouman/software/cluster/manual.pdf

Dempster, A. P., Laird, N. M., and Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1-38.

Frey, B. 1998. Graphical Models for Machine Learning and Digital Communication. MIT Press: Cambridge, MA.

Geman, S. and Geman, D. 1984. Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-6, no. 6, pp. 721-741, November.


Chris Pal, PhD Candidate, Brendan Frey, Professor
University of Waterloo, Waterloo, Ontario, Canada N2L 3G1
Email: cjpal@uwaterloo.ca, frey@uwaterloo.ca, Tel: +1-519-888-4567

Dave Swayne, Professor, Head of the Computing Research Laboratory for the Environment
University of Guelph, Guelph, Ontario, Canada N1G 2W1
Email: dswayne@snowhite.cis.uoguelph.ca, Tel: +1-519-824-4120 ext. 3411