For the water class example, cov(Band 1, Band 2) = 7.
A covariance matrix is an n × n matrix containing all variances and covariances
across the n bands of a signature. This is what the Maximum Likelihood and Mahalanobis
rules need.
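As a rough illustration of what such a matrix looks like, the sketch below builds one with NumPy for a two-band signature. The pixel values are hypothetical, not the lecture's data: they were invented so that cov(Band 1, Band 2) comes out at 7, and the name `water_pixels` is made up for this example.

```python
import numpy as np

# Hypothetical training pixels for one class (NOT the lecture's data).
# Rows are pixels; columns are Band 1 and Band 2.
water_pixels = np.array([
    [8, 14],
    [9, 18],
    [10, 20],
    [11, 22],
    [12, 26],
], dtype=float)

# rowvar=False: each column is a band (variable), each row is one pixel.
# np.cov divides by (N - 1) by default, i.e. the sample covariance.
cov_matrix = np.cov(water_pixels, rowvar=False)

# Diagonal entries are the per-band variances,
# off-diagonal entries are the band-to-band covariances.
print(cov_matrix)
```

For n bands the result is always n × n and symmetric, since cov(Band i, Band j) = cov(Band j, Band i).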
slide 7
Decision rules
A decision rule is the mathematical algorithm that sorts each pixel into a class. The four to know:
Parallelepiped — upper/lower bounds per band per class.
Minimum Distance — nearest class mean in feature space.
Maximum Likelihood / Bayesian — highest probability, assumes Gaussian classes.
Mahalanobis Distance — like Min Distance but scaled by class covariance.
slide 8 (picture)
Parallelepiped (1) — the concept
Each signature has an upper and a lower limit in every band. If a pixel's values fall inside the resulting n-dimensional box (within the limits in all bands), it is assigned to that signature's class.
Diagram uses ±1 standard deviation as the limits. Source: ERDAS Field Guide 2002 p. 229.
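The box test can be sketched in a few lines of NumPy. This is a minimal illustration, assuming limits of mean ± k standard deviations (k = 1, as in the diagram). The function name, the class statistics and the "first matching class wins" handling of overlapping boxes are assumptions made for the example, not part of the lecture.

```python
import numpy as np

def parallelepiped_classify(pixels, means, stds, k=1.0):
    """Return a class index per pixel, or -1 if the pixel falls in no box."""
    pixels = np.asarray(pixels, dtype=float)   # shape (n_pixels, n_bands)
    means = np.asarray(means, dtype=float)     # shape (n_classes, n_bands)
    stds = np.asarray(stds, dtype=float)       # shape (n_classes, n_bands)
    lower = means - k * stds
    upper = means + k * stds
    # inside[p, c] is True when pixel p is within class c's limits in EVERY band.
    inside = np.all(
        (pixels[:, None, :] >= lower) & (pixels[:, None, :] <= upper), axis=2
    )
    # Simplification: if boxes overlap, the first matching class is taken.
    return np.where(inside.any(axis=1), inside.argmax(axis=1), -1)

# Hypothetical class statistics for two classes in two bands.
means = [[10.0, 20.0], [40.0, 35.0]]
stds = [[2.0, 4.0], [3.0, 5.0]]
pixels = [[11, 22], [41, 33], [25, 25]]

labels = parallelepiped_classify(pixels, means, stds)
print(labels)  # the third pixel lies outside both boxes and stays unclassified (-1)
```

Unlike Minimum Distance, this rule can leave pixels unclassified, and it needs some tie-breaking policy where boxes overlap.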
Minimum Distance (2) — advantages and disadvantages.
Advantages:
- No unclassified pixels (every pixel has a nearest mean).
- Fast decision rule.
Disadvantages:
- Pixels that should be unclassified — because they’re not close to any training
class — will still be assigned somewhere (a “force-fit”).
- Does not consider class variability (two classes with very different spreads are
treated equally).
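The force-fit behaviour is easy to see in a small sketch. The code below assumes Euclidean distance to each class mean; the function name and the numbers are hypothetical.

```python
import numpy as np

def minimum_distance_classify(pixels, means):
    """Assign each pixel to the class with the nearest mean (Euclidean distance)."""
    pixels = np.asarray(pixels, dtype=float)   # shape (n_pixels, n_bands)
    means = np.asarray(means, dtype=float)     # shape (n_classes, n_bands)
    # dists[p, c] = distance from pixel p to the mean of class c
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return dists.argmin(axis=1), dists.min(axis=1)

# Hypothetical class means for two classes in two bands.
means = [[10.0, 20.0], [40.0, 35.0]]
# The second pixel is far from both training classes.
pixels = [[11, 22], [200, 200]]

labels, nearest = minimum_distance_classify(pixels, means)
print(labels)   # every pixel gets a class, including the distant one
print(nearest)  # the distance itself shows how poor the second fit is
```

Returning the distance alongside the label is one possible way to flag force-fitted pixels afterwards, for example by applying a user-chosen distance threshold.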
slide 12 (formula)
Maximum Likelihood (1)
Based on the probability that a pixel belongs to a particular class:
$$ p_{c} > p_{i} \qquad\text{for all } i \neq c,\ i = 1, 2, 3, \ldots, m\ \text{possible classes} $$
A pixel at x is assigned to class c if p_c, the likelihood that c is the correct class, is the largest of all m classes.
Source: ERDAS IMAGINE v8.3 Professional Training Reference Manual 1997 p. 29.
The most commonly used decision rule in supervised classification.
Assumptions:
Prior probabilities are equal for all classes (drop this → Bayesian rule, slide 15).
Each input band has a normal distribution inside each class.
Works well when classes are well-sampled and spectrally distinct; struggles on rare
classes or non-Gaussian distributions.
slide 15
Maximum Likelihood / Bayesian (3)
Bayesian decision rule: if the user has prior knowledge that class probabilities are
NOT equal, they can specify weight factors per class. This variation of Maximum
Likelihood is called the Bayesian decision rule (Hord 1982).
Characteristics:
- The most accurate classifier of the four, when assumptions hold.
- Takes the most variables into account (means, covariances, priors).
- But: if the band histograms are not normally distributed, Parallelepiped or Minimum
Distance may actually give better results.
slide 16 (formula)
Maximum Likelihood / Bayesian (4) — full rule
The pixel is assigned to the class for which D is the lowest:
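One form consistent with the description in these notes is the negative log of the Gaussian posterior with constant terms dropped. The signs are chosen so that the most probable class has the lowest D, and the slide's own layout may differ. Here X is the pixel's measurement vector, M_c the mean vector of class c and V_c its covariance matrix:

$$ D = -\ln(a_{c}) + \tfrac{1}{2}\ln\lvert V_{c}\rvert + \tfrac{1}{2}\,(X - M_{c})^{T}\, V_{c}^{-1}\, (X - M_{c}) $$

With equal priors the first term is the same for every class and can be ignored.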
a_c: prior probability that class c occurs in the image (equal for all classes,
or user-entered from a priori knowledge).
The last term, (X − M_c)^T V_c^{-1} (X − M_c), is the squared Mahalanobis distance:
the distance from the pixel X to the class mean M_c, scaled by the class's covariance
V_c. Mahalanobis alone (without the log-prior and log-determinant terms) is its own
decision rule.
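A minimal sketch of the rule, assuming D is taken as the negative log of the Gaussian posterior (constants dropped), so that the lowest D wins. The function names, class statistics and priors are hypothetical. The example pixel is chosen so that unequal priors change the Maximum Likelihood / Bayesian result while the Mahalanobis-only result stays the same.

```python
import numpy as np

def discriminant(x, mean, cov, prior):
    """Return (D, squared Mahalanobis distance) for one class.

    Assumed form: D = -ln(prior) + 0.5*ln|cov| + 0.5*(x-mean)^T cov^-1 (x-mean).
    Lower D means the class is more probable.
    """
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    # solve() avoids forming the inverse covariance matrix explicitly.
    maha_sq = float(diff @ np.linalg.solve(cov, diff))
    _, logdet = np.linalg.slogdet(cov)          # ln|cov|, numerically stable
    d = -np.log(prior) + 0.5 * logdet + 0.5 * maha_sq
    return float(d), maha_sq

def classify(x, means, covs, priors):
    """Return (class by lowest D, class by lowest Mahalanobis distance)."""
    results = [discriminant(x, m, v, a) for m, v, a in zip(means, covs, priors)]
    d_values = np.array([r[0] for r in results])
    maha_values = np.array([r[1] for r in results])
    return int(d_values.argmin()), int(maha_values.argmin())

# Hypothetical two-class, two-band setup with identity covariances.
means = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
covs = [np.eye(2), np.eye(2)]
pixel = np.array([2.2, 0.0])   # slightly closer to class 1

equal = classify(pixel, means, covs, priors=[0.5, 0.5])
weighted = classify(pixel, means, covs, priors=[0.9, 0.1])
print(equal)     # both rules pick the nearer class
print(weighted)  # a strong prior for class 0 changes the first result only
```

The Mahalanobis-only rule ignores the priors and the log-determinant, which is why its answer does not move when the weights change.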