Covariance Matrix Adaptation ES

We saw several ways to perform Gaussian mutation in ES: a single global step size, uncorrelated step sizes (one per dimension), and correlated mutations. All of them can be generalized as sampling from a single multivariate normal distribution:

$$x_k \sim \mathcal{N}(m, \sigma^2 C)$$

where:

$m \in \mathbb{R}^{n}$ is the current search mean
$\sigma > 0$ is the global step size
$C \in \mathbb{R}^{n \times n}$ is the covariance matrix that defines the shape and orientation of the search distribution
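As a minimal sketch of what sampling from this distribution looks like in practice, the snippet below draws offspring with NumPy. The function name `sample_offspring` and the example values of `m`, `sigma`, and `C` are illustrative assumptions, not part of these notes. It uses the fact that if $z \sim \mathcal{N}(0, I)$ and $A A^{T} = C$, then $m + \sigma A z \sim \mathcal{N}(m, \sigma^2 C)$.

```python
import numpy as np

def sample_offspring(m, sigma, C, lam, rng):
    """Draw lam candidates x_k ~ N(m, sigma^2 C) (hypothetical helper)."""
    # Cholesky factor A satisfies A @ A.T == C; C must be symmetric positive definite.
    A = np.linalg.cholesky(C)
    z = rng.standard_normal((lam, m.size))  # each row is z_k ~ N(0, I)
    return m + sigma * z @ A.T              # each row is m + sigma * A z_k

rng = np.random.default_rng(0)
m = np.array([1.0, -2.0])                   # example search mean (assumed values)
C = np.array([[2.0, 0.6], [0.6, 0.5]])      # example covariance (assumed values)
X = sample_offspring(m, 0.3, C, lam=10, rng=rng)
print(X.shape)  # (10, 2)
```

Setting `C` to the identity recovers the single global step size case, and a diagonal `C` recovers the uncorrelated case.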

In 2 dimensions, $C$ can be decomposed as:

$$C = B D^{2} B^{T}$$

where

$$D = \begin{pmatrix} \sigma_{1} & 0 \\ 0 & \sigma_{2} \end{pmatrix}, \quad B = \begin{pmatrix} \cos α & \sin α \\ -\sin α & \cos α \end{pmatrix}$$
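To make the decomposition concrete, here is a small sketch that builds $C$ from the two axis step sizes and the rotation angle, using the same $B$ and $D$ as above. The function name `covariance_2d` and the example values are assumptions for illustration.

```python
import numpy as np

def covariance_2d(sigma1, sigma2, alpha):
    """Build C = B D^2 B^T from two step sizes and a rotation angle (hypothetical helper)."""
    D = np.diag([sigma1, sigma2])                       # axis lengths of the ellipse
    B = np.array([[np.cos(alpha), np.sin(alpha)],
                  [-np.sin(alpha), np.cos(alpha)]])     # orthogonal rotation matrix
    return B @ D @ D @ B.T

C = covariance_2d(2.0, 0.5, np.pi / 4)
print(C)
print(np.linalg.eigvalsh(C))  # eigenvalues are sigma2^2 and sigma1^2, independent of alpha
```

Rotating by `alpha` changes the orientation of the search ellipse but not its axis lengths, which is why the eigenvalues of `C` depend only on `sigma1` and `sigma2`.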

Recall that in classical correlated ES the chromosome was:

$$\langle x_{1}, x_{2}, \sigma_{1}, \sigma_{2}, α \rangle$$

CMA-ES replaces explicit parameter mutation by learning the covariance matrix $C$ directly.

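$C$ is typically learned from the selected offspring together with evolution paths, which accumulate the successive steps taken by the mean. In each generation, after sampling $λ$ offspring, CMA-ES ranks them by fitness and keeps the best ones. Then: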

The mean $m$ is updated to move toward the better samples.
The covariance matrix $C$ is updated to reinforce successful directions.
The global step size $\sigma$ is increased when progress suggests larger moves, and decreased when refinement is needed.
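The three updates above can be sketched as one loop. This is a simplified illustration, not a reference implementation: the learning rates, damping, and recombination weights are commonly cited CMA-ES defaults that are assumed here rather than taken from these notes, and safeguards found in full implementations (such as stalling the covariance path when the step-size path is long, restarts, and boundary handling) are omitted. The names `cma_es_sketch` and `ellipsoid` are hypothetical.

```python
import numpy as np

def cma_es_sketch(f, m, sigma, generations=200, seed=0):
    """Minimize f with a simplified CMA-ES loop (assumed default constants)."""
    rng = np.random.default_rng(seed)
    n = m.size
    lam = 4 + int(3 * np.log(n))                 # offspring per generation
    mu = lam // 2                                # number of best samples kept
    w = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    w /= w.sum()                                 # recombination weights, best first
    mu_eff = 1.0 / np.sum(w ** 2)

    # Learning rates and damping (assumed defaults, not from the notes).
    c_sigma = (mu_eff + 2) / (n + mu_eff + 5)
    d_sigma = 1 + 2 * max(0.0, np.sqrt((mu_eff - 1) / (n + 1)) - 1) + c_sigma
    c_c = (4 + mu_eff / n) / (n + 4 + 2 * mu_eff / n)
    c_1 = 2 / ((n + 1.3) ** 2 + mu_eff)
    c_mu = min(1 - c_1, 2 * (mu_eff - 2 + 1 / mu_eff) / ((n + 2) ** 2 + mu_eff))
    chi_n = np.sqrt(n) * (1 - 1 / (4 * n) + 1 / (21 * n ** 2))  # approx. E||N(0, I)||

    C = np.eye(n)
    p_sigma = np.zeros(n)                        # evolution path for the step size
    p_c = np.zeros(n)                            # evolution path for the covariance

    for _ in range(generations):
        # Decompose C = B diag(d^2) B^T and sample x_k = m + sigma * B D z_k.
        eigvals, B = np.linalg.eigh(C)
        d = np.sqrt(np.maximum(eigvals, 1e-20))
        z = rng.standard_normal((lam, n))
        y = (z * d) @ B.T
        x = m + sigma * y

        # Rank by fitness and keep the mu best steps.
        order = np.argsort([f(xi) for xi in x])[:mu]
        y_sel = y[order]
        y_w = w @ y_sel                          # weighted mean step

        # 1) Move the mean toward the better samples.
        m = m + sigma * y_w

        # Accumulate evolution paths; C^{-1/2} y_w = B D^{-1} B^T y_w.
        c_inv_sqrt_y = B @ ((B.T @ y_w) / d)
        p_sigma = (1 - c_sigma) * p_sigma \
            + np.sqrt(c_sigma * (2 - c_sigma) * mu_eff) * c_inv_sqrt_y
        p_c = (1 - c_c) * p_c + np.sqrt(c_c * (2 - c_c) * mu_eff) * y_w

        # 2) Reinforce successful directions (rank-one plus rank-mu terms).
        rank_one = np.outer(p_c, p_c)
        rank_mu = (y_sel.T * w) @ y_sel          # sum_i w_i y_i y_i^T
        C = (1 - c_1 - c_mu) * C + c_1 * rank_one + c_mu * rank_mu

        # 3) Grow sigma if the path is longer than expected under random
        #    selection, shrink it if shorter.
        sigma *= np.exp((c_sigma / d_sigma) * (np.linalg.norm(p_sigma) / chi_n - 1))
    return m, sigma, C

def ellipsoid(x):
    """Ill-conditioned test function (hypothetical example objective)."""
    return x[0] ** 2 + 100 * x[1] ** 2

m_best, sigma_final, C_final = cma_es_sketch(ellipsoid, np.array([3.0, -2.0]), 1.0)
print(ellipsoid(m_best), sigma_final)
```

Note that nothing in the loop mutates $\sigma_{1}$, $\sigma_{2}$, or $α$ directly; the shape and orientation of the distribution come entirely from the statistics of the selected steps.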

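CMA-ES is widely considered state-of-the-art for continuous black-box optimization. It is much more principled than plain correlated ES: in correlated ES we are essentially still adaptively guessing the strategy parameters by mutating them, whereas CMA-ES updates $m$, $\sigma$, and $C$ from the statistics of the successful samples.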
