A supervised learning model defines a mapping from one or more inputs to one or more outputs. The model is a mathematical equation; when the inputs are passed through this equation, it computes the output (inference). The model equation also contains parameters. Different parameter values change the outcome of the computation; the model equation describes a family of possible input-output mappings, and the parameters specify the particular relationship.
When we train or learn a model, we find parameters that describe the true relationship between inputs and outputs. A learning algorithm takes a training set of input/output pairs and manipulates the parameters until the inputs predict their corresponding outputs as closely as possible.
Specifically, we aim to build a model that takes an input $\mathbf{x}$ and outputs a prediction $\mathbf{y}$, both of which are vectors. To make the prediction, we need a model $\mathbf{f}[\bullet]$ that takes the input $\mathbf{x}$ and returns $\mathbf{y}$, so:

$$\mathbf{y} = \mathbf{f}[\mathbf{x}].$$
The model is a mathematical equation with a fixed form, representing a family of relations between input and output. The model contains parameters $\boldsymbol{\phi}$, where the choice of parameters determines the particular relation between input and output:

$$\mathbf{y} = \mathbf{f}[\mathbf{x}, \boldsymbol{\phi}].$$
We learn these parameters using a training dataset of $I$ pairs of input and output examples $\{\mathbf{x}_i, \mathbf{y}_i\}_{i=1}^{I}$. We aim to select parameters that map each training input to its associated output as closely as possible. We quantify the degree of mismatch in this mapping with the loss $L$. This is a scalar value that summarizes how poorly the model predicts the training outputs from their corresponding inputs for parameters $\boldsymbol{\phi}$.
We can treat the loss as a function $L[\boldsymbol{\phi}]$ of these parameters. When we train the model, we are seeking parameters $\hat{\boldsymbol{\phi}}$ that minimize this loss function:

$$\hat{\boldsymbol{\phi}} = \underset{\boldsymbol{\phi}}{\operatorname{argmin}}\; L[\boldsymbol{\phi}].$$
If the loss is small after this minimization, we have found model parameters that accurately predict the training outputs $\mathbf{y}_i$ from the training inputs $\mathbf{x}_i$.
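The following sketch makes this recipe concrete in Python. The function names, the toy cubic model, and the small dataset are illustrative choices of ours, not part of the text, and the off-the-shelf numerical minimizer simply stands in for the training algorithm described later: a model with a fixed form, a scalar loss over the training set, and a search for the parameters that minimize that loss.

```python
import numpy as np
from scipy.optimize import minimize


def model(x, phi):
    # A fixed functional form; any equation of x and phi could stand in here.
    return phi[0] + phi[1] * x + phi[2] * x ** 3


def loss(phi, x_train, y_train):
    # Scalar summary of how poorly the model maps training inputs to outputs.
    return np.sum((model(x_train, phi) - y_train) ** 2)


# Illustrative training pairs {x_i, y_i}.
x_train = np.array([0.1, 0.4, 0.7, 1.0])
y_train = np.array([0.2, 0.5, 0.9, 1.4])

# Training: search for the parameters that minimize the loss function.
result = minimize(loss, x0=np.zeros(3), args=(x_train, y_train))
phi_hat = result.x
```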
After training a model, we assess its performance by running the model on separate test data to see how well it generalizes to examples that it did not observe during training.
Linear Regression Example
We can make the idea above more concrete with a simple example of regression. We consider a model that predicts a single output $y$ from a single input $x$. A 1D linear regression model describes a straight line:

$$y = f[x, \boldsymbol{\phi}] = \phi_0 + \phi_1 x.$$
This model has two parameters $\boldsymbol{\phi} = [\phi_0, \phi_1]^{T}$, where $\phi_0$ is the $y$-intercept of the line and $\phi_1$ is the slope. Different choices for the intercept and the slope result in different relations, hence the model defines a family of possible input-output relations.
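In code, the model is a single line (the function name is ours):

```python
def linear_model(x, phi):
    # phi[0] is the y-intercept and phi[1] is the slope of the line.
    return phi[0] + phi[1] * x
```

Passing a NumPy array of inputs returns the prediction for every element at once.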
For this model, the training dataset consists of $I$ input/output pairs $\{x_i, y_i\}$. The loss $L[\boldsymbol{\phi}]$ quantifies the mismatch between the model predictions $f[x_i, \boldsymbol{\phi}]$ and the ground-truth outputs $y_i$ as a sum of squared deviations over all $I$ training pairs:

$$L[\boldsymbol{\phi}] = \sum_{i=1}^{I} \bigl(f[x_i, \boldsymbol{\phi}] - y_i\bigr)^2 = \sum_{i=1}^{I} \bigl(\phi_0 + \phi_1 x_i - y_i\bigr)^2.$$
This is called the least-squares loss. The squaring operation means that the direction of the deviation (above or below the data) is unimportant.
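A self-contained version of this loss (variable names are illustrative) is simply the equation above translated into NumPy:

```python
import numpy as np


def least_squares_loss(phi, x, y):
    # Sum of squared deviations between the line's predictions and the
    # observed outputs; squaring treats deviations above and below equally.
    predictions = phi[0] + phi[1] * x
    return np.sum((predictions - y) ** 2)
```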
The goal of the training process is then to find the parameters $\hat{\boldsymbol{\phi}} = [\hat{\phi}_0, \hat{\phi}_1]^{T}$ that minimize this quantity:

$$\hat{\boldsymbol{\phi}} = \underset{\boldsymbol{\phi}}{\operatorname{argmin}}\; L[\boldsymbol{\phi}] = \underset{\boldsymbol{\phi}}{\operatorname{argmin}} \left[\, \sum_{i=1}^{I} \bigl(\phi_0 + \phi_1 x_i - y_i\bigr)^2 \right].$$
There are only two parameters, so we can calculate the loss for every combination of values of $\phi_0$ and $\phi_1$ and visualize the loss function as a surface. The “best” parameters are at the minimum of this surface.
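A minimal sketch of this grid evaluation follows; the training pairs and the parameter ranges are illustrative, and the loss function is repeated so the snippet stands alone.

```python
import numpy as np

# Illustrative training pairs {x_i, y_i}.
x_train = np.array([0.1, 0.4, 0.7, 1.0])
y_train = np.array([0.3, 0.6, 0.8, 1.1])


def least_squares_loss(phi, x, y):
    return np.sum((phi[0] + phi[1] * x - y) ** 2)


# Evaluate the loss for every combination of intercept and slope on a grid;
# the "best" parameters sit at the minimum of the resulting surface.
intercepts = np.linspace(-1.0, 2.0, 200)
slopes = np.linspace(-1.0, 2.0, 200)
surface = np.array([[least_squares_loss((b, w), x_train, y_train)
                     for w in slopes] for b in intercepts])

i, j = np.unravel_index(surface.argmin(), surface.shape)
print("grid minimum near intercept", intercepts[i], "and slope", slopes[j])
```

Plotting `surface` (for example, as a contour plot) shows the bowl shape whose lowest point marks the best-fitting line.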
To train the model, the basic method is to choose the initial parameters randomly and then improve them by “walking down” the loss function until we reach the bottom. One way to do this is to measure the gradient of the surface at the current position and take a step in the direction that is most steeply downhill. We then repeat this process until the gradient is flat and we can improve no further.
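A bare-bones version of this procedure for the 1D linear model is sketched below. The learning rate, iteration count, and data are illustrative choices, and a real implementation would add a convergence test rather than a fixed number of steps.

```python
import numpy as np

# Illustrative training pairs {x_i, y_i}.
x_train = np.array([0.1, 0.4, 0.7, 1.0])
y_train = np.array([0.3, 0.6, 0.8, 1.1])

phi = np.random.randn(2)      # random initial [intercept, slope]
learning_rate = 0.1

for _ in range(1000):
    residuals = phi[0] + phi[1] * x_train - y_train
    # Gradient of the sum-of-squares loss with respect to [intercept, slope].
    gradient = np.array([2.0 * np.sum(residuals),
                         2.0 * np.sum(residuals * x_train)])
    # Step in the direction that is most steeply downhill.
    phi -= learning_rate * gradient

print("fitted intercept and slope:", phi)
```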
Having trained the model, we test it by computing the loss on a separate set of test data. This shows how well the model generalizes to examples that it did not observe during training.
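Continuing the sketch above (the held-out pairs are again illustrative), the test loss uses the same sum-of-squares formula, evaluated with the fitted parameters on data the model never saw:

```python
# Held-out test pairs, not used during training.
x_test = np.array([0.25, 0.55, 0.85])
y_test = np.array([0.40, 0.65, 0.90])

# Same sum-of-squares measure, evaluated with the fitted parameters phi.
test_loss = np.sum((phi[0] + phi[1] * x_test - y_test) ** 2)
print("test loss:", test_loss)
```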
The test performance can be poor for two reasons:
- A simple model like a line might not be able to capture the true relationship between input and output. This is known as underfitting.
- Conversely, a very expressive model may describe statistical peculiarities of the training data that are atypical and lead to unusual predictions. This is known as overfitting.