A corner is an intersection of edges. It’s visually distinctive because movement in any direction causes a large change in appearance.

Shifting an image window:
- Flat region: Shifting looks almost the same
- Edge: shifting along the looks similar, across looks different
- Corner: Shifting in any direction changes appearance a lot
Harris Corner Detector
The goal of Harris corner detection is to detect points where small shifts cause large differences in image intensity.
This uses the SSD similarity measure. We compare a patch at with the same patch shifted by :
However, working with the last term directly is expensive, so we approximate it using a Taylor expansion:
Substituting back:
In matrix form:
where
is the second moment matrix that summarizes how intensity changes around the pixel.
The eigenvalues of can tell us whether the corner type:
- Both eigenvalues are small: Flat region
- One large, one small: Edge
- Both large: Corner
Rather than computing eigenvalues directly, Harris uses:
- R large positive → corner
- R near zero → flat region
- R negative → edge
SIFT
Harris corners work only at one fixed scale. If you zoom in or out, the detected corners change. SIFT (Scale Invariant Feature Transform) was designed to detect the same feature even if the image is rotated, viewpoint is changed, scaled, or illuminated differently.
DoG: First, we create multiple blurred versions of the same image using Gaussians of increasing blur (). This gives us image layers from sharp to blurry. Then, the images are downsampled, forming lower resolution copies. This is called a Gaussian pyramid.
Then, we use DoG (Difference of Gaussians) to search multiple scales:
which highlights regions where intensity changes sharply.

SIFT keypoint seleciton: Then, we select keypoints by comparing each point to 26 neighbors (8 in the same scale, 9 in the scale above, 9 in the scale below). If it’s a maximum or a minimum, it becomes a keypoint candidate.
Keypoint refinement: Some extrema come from noise, unstable edges, or weak textures. Thus, SIFT uses interpolation to refine keypoint location and reject weak ones.
Remove low contrast/poor keypoints: Only strong, distinctive keypoints are kept.
Orientation assignment: For each keypoint, compute the dominant gradient direction. Then locate the local patch so that this direction becomes . This means that if the object rotates, the descriptor rotates with it, making SIFT rotation invariant.
SIFT detector privileges highly informative image patches rather than simple edges. It doesn’t just find corners or edges, but find stable, repeatable patterns with a strong local gradient structure.