Given an objective function or loss function, how do we optimize it?
We have a training set of input/output pairs. We seek parameters for the model that map the inputs to the outputs as closely as possible. To this end, we define a loss (or objective) function that returns a single number quantifying the mismatch in this mapping. The goal of an optimization algorithm is to find parameters that minimize this loss:
- The objective/loss function defines a surface over the model parameters, and we want to find the parameter values at the lowest point on that surface.
Most standard optimization algorithms are iterative: they initialize the parameters heuristically and then adjust them repeatedly in such a way that the loss decreases.
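This iterative scheme can be sketched with gradient descent, one of the simplest such algorithms. The example below is a minimal illustration, not a method prescribed by the text: the toy data, the linear model, the least-squares loss, and the step size are all assumptions made for the sketch.

```python
import numpy as np

# Toy training set: inputs x and outputs y, roughly following y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

def loss(params):
    """Least-squares loss: a single number quantifying the mismatch."""
    slope, intercept = params
    pred = slope * x + intercept
    return np.mean((pred - y) ** 2)

def grad(params):
    """Gradient of the loss with respect to the two parameters."""
    slope, intercept = params
    residual = slope * x + intercept - y
    return np.array([2 * np.mean(residual * x), 2 * np.mean(residual)])

# Heuristic initialization, then repeated adjustments that decrease the loss.
params = np.array([0.0, 0.0])
lr = 0.05  # step size (an assumed value; tuned in practice)
for _ in range(2000):
    params -= lr * grad(params)

print(params, loss(params))
```

Each step moves the parameters a small distance downhill on the loss surface; after enough steps the parameters settle near the lowest point the procedure can reach from its starting position.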