Flashcards


Supervised vs. unsupervised classification — key differences?

essential classification

Two camps of pixel classification:

  • 👨‍🏫 Supervised — needs a teacher
    • Algorithm: Maximum Likelihood (most common)
    • A priori knowledge: required — you give it training samples
    • Control: user-driven — you set the class scheme
    • Strength: more accurate when training is good
  • 🤖 Unsupervised — clusters on its own
    • Algorithm: ISODATA (or K-means)
    • A priori knowledge: not required — no training
    • Control: computer-automated — it finds the groupings
    • Strength: reveals natural groupings, fast first pass

After an unsupervised run you still label the clusters by hand; that is where Recode comes in.

💡

Supervised needs a TEACHER (training samples). Unsupervised is CLUSTERING — the computer invents the classes, you label afterward.

ISODATA — three parameters the user must set?

essential classification
  • N — max number of clusters (= max classes).
  • T — convergence threshold: % of pixels unchanged between iterations to declare convergence.
  • M — max iterations (safety cap).
💡

N-T-M: Number (clusters), Threshold (convergence %), Max-iterations. Terminates on whichever hits first: T reached or M hit.
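
A minimal sketch of the two stopping rules, assuming single-band pixel values and a plain K-means-style update. Real ISODATA also splits and merges clusters, which is skipped here. The function name and the sample numbers are made up for illustration; N is simply the number of starting means.

```python
def cluster_until_converged(pixels, initial_means, t_percent, max_iter):
    """K-means-style loop that stops on T (percent unchanged) or M (max_iter)."""
    means = list(initial_means)            # N = len(initial_means)
    labels = [None] * len(pixels)          # nothing assigned before iteration 1
    for iteration in range(1, max_iter + 1):
        # Assign every pixel to the nearest cluster mean.
        new_labels = [
            min(range(len(means)), key=lambda c: abs(p - means[c]))
            for p in pixels
        ]
        # Percent of pixels whose label did not change since the last pass.
        unchanged = sum(old == new for old, new in zip(labels, new_labels))
        pct_unchanged = 100.0 * unchanged / len(pixels)
        labels = new_labels
        # Recompute each cluster mean from its current members.
        for c in range(len(means)):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                means[c] = sum(members) / len(members)
        if pct_unchanged >= t_percent:
            return labels, means, iteration, "T reached"
    return labels, means, max_iter, "M hit"


pixels = [10, 12, 11, 48, 52, 51]          # hypothetical brightness values
labels, means, n_iter, reason = cluster_until_converged(pixels, [0, 100], 95, 10)
print(labels, n_iter, reason)
```

With a generous M the loop ends on T; rerun it with `max_iter=2` and the safety cap fires first.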

Maximum Likelihood classifier — how it works and its main assumption?

essential classification

Picture each land-cover class (forest, water, city) as a fuzzy cloud floating in spectral space. Max Likelihood asks the question:

“Which cloud is this pixel most likely sitting inside?”

Crucially, it considers each cloud’s shape (covariance), not just its center. Min Distance only looks at the center — it’s like deciding which city you’re closest to without checking which one’s borders you’re inside.

That’s why ML usually wins on accuracy: it knows that a tightly grouped class should only claim pixels close to its center, while a wide, loose class can reasonably claim pixels much farther out.

🔬 Science / formula

For each pixel, compute the probability it belongs to each class. Assign to the most likely class.

  • Key assumption: each class is multivariate Gaussian (each band normally distributed inside the class).
  • Uses the class mean AND covariance matrix — that’s how it accounts for class shape, not just position.
  • Strength: most accurate of the four decision rules when classes are well-sampled and bands are roughly normal.
  • Weakness: needs enough training samples to estimate covariance; fails on non-Gaussian classes.
  • Bayesian variant (Hord 1982): supply per-class prior probabilities instead of assuming equal priors.
💡

ML uses **mean + covariance**, not just the mean. Min Distance ignores shape (just the mean). Mahalanobis adds shape. Max Likelihood adds the covariance-determinant term on top, plus per-class priors in the Bayesian variant. The full discriminant formula lives in the long-form review; the *concept* lives here.
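
To see the "shape matters" idea in numbers, here is a rough two-band sketch, assuming equal priors and two invented classes ("tight" and "loose"). The score is the usual Gaussian log-likelihood with constant terms dropped; the function names and all values are illustrative, not taken from any particular software package.

```python
import math


def ml_score(pixel, mean, cov, prior=1.0):
    """Gaussian log-likelihood score for a 2-band pixel (constants dropped)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    # Squared Mahalanobis distance, using the closed-form 2x2 inverse.
    maha_sq = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * maha_sq


def max_likelihood_class(pixel, classes):
    return max(classes, key=lambda n: ml_score(pixel, classes[n]["mean"], classes[n]["cov"]))


def min_distance_class(pixel, classes):
    def dist_sq(name):
        return sum((p - m) ** 2 for p, m in zip(pixel, classes[name]["mean"]))
    return min(classes, key=dist_sq)


# Hypothetical signatures: one compact cloud, one spread-out cloud.
classes = {
    "tight": {"mean": (10, 10), "cov": ((1, 0), (0, 1))},
    "loose": {"mean": (20, 20), "cov": ((36, 0), (0, 36))},
}
pixel = (14, 14)
print(min_distance_class(pixel, classes))    # nearest center
print(max_likelihood_class(pixel, classes))  # most probable cloud
```

The pixel is nearer the "tight" center, yet it sits far outside that small cloud and comfortably inside the "loose" one, so the two rules disagree.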

Supervised classification — three-step procedure?

essential classification
  1. Select training samples — homogeneous AOIs per class.
  2. Generate & evaluate statistical signatures: per-band mean and standard deviation, plus the between-band covariance matrix.
  3. Class assignment via a decision rule (Min Distance, Max Likelihood, etc.).
💡

Train → Stats → Classify. You TEACH the computer with samples, it LEARNS signatures (mean + covariance), then APPLIES a decision rule to every pixel.
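
The three steps can be sketched end to end as below, assuming two bands, two invented classes, and Minimum Distance as the decision rule. The covariance matrix is left out of the signature to keep the example short; every name and number is hypothetical.

```python
from statistics import mean, stdev

# Step 1: training samples, hypothetical (red, NIR) values picked per class.
training = {
    "water":  [(20, 10), (22, 12), (21, 11)],
    "forest": [(30, 80), (32, 84), (31, 82)],
}


# Step 2: statistical signature per class (per-band mean and std).
def signature(samples):
    bands = list(zip(*samples))            # regroup values band by band
    return {
        "mean": [mean(b) for b in bands],
        "std": [stdev(b) for b in bands],
    }


signatures = {name: signature(s) for name, s in training.items()}


# Step 3: decision rule, here Minimum Distance to the class mean.
def classify(pixel):
    def dist_sq(name):
        return sum((p - m) ** 2 for p, m in zip(pixel, signatures[name]["mean"]))
    return min(signatures, key=dist_sq)


print(signatures["water"])
print(classify((23, 14)))
```

Swapping the function in step 3 is all it takes to try a different decision rule on the same signatures.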

Four common decision rules for supervised classification?

likely classification

The four ways the computer assigns a pixel to a class, ordered from simplest to most sophisticated:

  • 📦 Parallelepiped
    • How: box-shaped regions defined by each class’s min/max in every band
    • Pro: simple, fast first pass
    • Con: leaves gaps + has corner overlap problems
  • 📏 Minimum Distance
    • How: assign to whichever class mean is closest (Euclidean)
    • Pro: no unclassified pixels, very fast
    • Con: ignores class shape, force-fits outliers
  • 📐 Mahalanobis Distance
    • How: Min Distance scaled by each class’s covariance
    • Pro: accounts for class shape (ellipsoidal)
    • Con: still assumes equal priors
  • 🎯 Maximum Likelihood
    • How: pick the most probable class, assuming Gaussian distributions
    • Pro: most accurate when classes are well-sampled
    • Con: needs lots of training data, fails on non-Gaussian classes

Minimum Distance classifier — advantages and disadvantages?

likely classification

✅ Advantages

  • ⚡ No unclassified pixels (every pixel has some nearest mean)
  • 🚀 Very fast decision rule

❌ Disadvantages

  • 🎯 Force-fits outlier pixels that should be flagged unclassified
  • 📐 Ignores class variability: treats tight clusters and loose clusters equally

💡

Min Distance: everyone gets a class (no gaps) but weird pixels get force-fit. Ignores shape. Opposite of Parallelepiped (which leaves gaps).

Parallelepiped classifier — advantages and disadvantages?

likely classification

✅ Advantages

  • ⚡ Very simple, very fast
  • 🎯 Good first-pass broad classification

❌ Disadvantages

  • 🕳️ Gap regions: pixels outside all boxes stay unclassified
  • ↗️ Corner overlaps: pixels in two boxes get assigned ambiguously

💡

Boxes. Pixel in a box → class. Pixel in NO box → unclassified (gap). Pixel in 2+ boxes (corner overlap) → ambiguous, so it may land in the wrong class. Fast but sloppy.
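
A small sketch of the box test, assuming two bands and boxes built from the min/max of invented training samples. The class names and values are hypothetical; the point is only to show the three outcomes (one box, no box, several boxes).

```python
def box(samples):
    """Per-band (min, max) limits from a class's training samples."""
    bands = list(zip(*samples))
    return [(min(b), max(b)) for b in bands]


def matching_classes(pixel, boxes):
    """Every class whose box contains the pixel in all bands."""
    return [
        name
        for name, limits in boxes.items()
        if all(lo <= value <= hi for value, (lo, hi) in zip(pixel, limits))
    ]


# Hypothetical training samples whose boxes overlap at one corner.
boxes = {
    "water": box([(10, 5), (30, 25)]),
    "urban": box([(25, 20), (60, 50)]),
}

for pixel in [(15, 10), (80, 80), (28, 22)]:
    hits = matching_classes(pixel, boxes)
    if not hits:
        print(pixel, "unclassified (gap)")
    elif len(hits) == 1:
        print(pixel, hits[0])
    else:
        print(pixel, "ambiguous:", hits)
```

How the ambiguous case is resolved (first match, fallback to another rule, and so on) is a software choice that this sketch deliberately leaves open.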

Anderson (1976) LULC — nine Level I classes?

likely classification
  • 🏙️ 1 — Urban / Built-up
  • 🌾 2 — Agricultural
  • 🌿 3 — Rangeland
  • 🌲 4 — Forest
  • 💧 5 — Water
  • 🪷 6 — Wetland
  • 🪨 7 — Barren Land
  • ❄️ 8 — Tundra
  • 🧊 9 — Perennial Snow / Ice

📚 Level II adds 37 subclasses in total. The system is used as the standard hierarchical scheme for remote-sensing (RS) data.

Covariance between two bands — formula?

maybe classification
\[C_{QR} = \frac{\sum_{i=1}^{k}(Q_i - \bar Q)(R_i - \bar R)}{k - 1}\]

Measures how Band Q and Band R vary together around their means, where i indexes the sampled pixels and k is the number of pixels. The full covariance matrix is what Max Likelihood and Mahalanobis use.
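
The formula translates almost line for line into code. This is a plain-Python sketch with made-up band values; the function name `covariance` is just for illustration.

```python
def covariance(q, r):
    """Sample covariance of two equally long band-value lists (divides by k - 1)."""
    k = len(q)
    q_bar = sum(q) / k
    r_bar = sum(r) / k
    return sum((qi - q_bar) * (ri - r_bar) for qi, ri in zip(q, r)) / (k - 1)


# Hypothetical values of five training pixels in two bands.
band_q = [1, 2, 3, 4, 5]
band_r = [2, 4, 5, 4, 5]

# Full 2x2 covariance matrix: variances on the diagonal, covariance off it.
cov_matrix = [
    [covariance(band_q, band_q), covariance(band_q, band_r)],
    [covariance(band_r, band_q), covariance(band_r, band_r)],
]
print(cov_matrix)
```

Feeding a band to itself gives its variance, which is why the diagonal of the matrix holds the per-band variances.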

ISODATA convergence threshold T — worked example (10 pixels, 3 changed)?

maybe classification
  • changed = 3

  • unchanged = 7

  • % unchanged = unchanged ÷ total = 7/10 = 70%

If the user set T = 95%, then 70% unchanged is not enough → run another iteration.