Multi-layer Neural Network

Multi-layer neural networks allow us to express more complex hypotheses than a single Neural Network Layer, which can only really express linear separators. Multi-layer neural networks, as the name suggests, combine multiple layers, most typically by feeding the outputs of one layer into the inputs of the next layer.

Nomenclature:

We use $l$ to name a layer.
$m^{l}$ is the number of inputs to layer $l$
$n^{l}$ is the number of outputs from layer $l$
The weights $W^{l}$ and offsets $W_{0}^{l}$ are of shape $m^{l} \times n^{l}$ and $n^{l} \times 1$ respectively
$f^{l} (\cdot)$ is the activation function of layer $l$

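Then the pre-activation output of layer $l$ is the $n^{l} \times 1$ vector

$$Z^{l} = (W^{l})^{T} A^{l - 1} + W_{0}^{l}$$

where $A^{l - 1}$ is the $m^{l} \times 1$ input to the layer (the activation output of the previous layer, or the network input for the first layer), and the activation output is the $n^{l} \times 1$ vector

$$A^{l} = f^{l} (Z^{l})$$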
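As a rough illustration of the two equations above, here is a minimal NumPy sketch of one layer's forward computation. The function names `layer_forward` and `relu`, the choice of ReLU as $f^{l}$, and the specific numbers are assumptions made for this example only; they are not part of the notes.

```python
import numpy as np

def layer_forward(W, W0, A_prev, f):
    """Return (Z, A) for one layer: Z = W^T A_prev + W0, A = f(Z)."""
    Z = W.T @ A_prev + W0  # (n x m) @ (m x 1) + (n x 1) -> (n x 1)
    A = f(Z)               # activation applied element-wise
    return Z, A

def relu(z):
    # Assumed activation for this example; any element-wise f would do.
    return np.maximum(z, 0.0)

# Hypothetical layer with m = 3 inputs and n = 2 outputs.
W = np.array([[1.0, -1.0],
              [0.5,  0.0],
              [0.0,  2.0]])                  # shape (m, n) = (3, 2)
W0 = np.array([[0.0], [-1.0]])               # shape (n, 1) = (2, 1)
A_prev = np.array([[1.0], [2.0], [-1.0]])    # shape (m, 1) = (3, 1)

Z, A = layer_forward(W, W0, A_prev, relu)
print(Z.ravel())  # pre-activation values: 2 and -4
print(A.ravel())  # after ReLU: 2 and 0
```

Note that $W^{l}$ is stored with shape $m^{l} \times n^{l}$, so the transpose in the code mirrors the $(W^{l})^{T}$ in the formula.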

Activation function uniformity

It is technically possible to have different activation functions within the same layer, but for convenience in specification and implementation, we generally have the same activation function within a layer.

Below is a diagram of a many-layered network, with two blocks for each layer, one representing the linear part of the operation and one representing the non-linear activation function. This structural decomposition is useful to organize our algorithmic thinking and implementation.
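The same two-block decomposition can also be sketched in code. The following is one possible arrangement, assuming hypothetical `Linear` and `Activation` classes, random initial weights, and arbitrarily chosen layer sizes and activation functions; none of these details come from the notes.

```python
import numpy as np

class Linear:
    """Linear block of a layer: Z = W^T A_prev + W0."""
    def __init__(self, m, n, rng):
        self.W = rng.normal(size=(m, n))  # shape (m, n)
        self.W0 = np.zeros((n, 1))        # shape (n, 1)

    def forward(self, A_prev):
        return self.W.T @ A_prev + self.W0

class Activation:
    """Non-linear block of a layer: A = f(Z), applied element-wise."""
    def __init__(self, f):
        self.f = f

    def forward(self, Z):
        return self.f(Z)

def identity(z):
    return z

def network_forward(blocks, X):
    """Feed the output of each block into the input of the next."""
    A = X
    for block in blocks:
        A = block.forward(A)
    return A

rng = np.random.default_rng(0)
# Two layers, each a (Linear, Activation) pair: 3 -> 4 -> 2.
blocks = [
    Linear(3, 4, rng), Activation(np.tanh),
    Linear(4, 2, rng), Activation(identity),
]
X = np.ones((3, 1))                  # network input, shape (3, 1)
output = network_forward(blocks, X)  # shape (2, 1)
```

Keeping the linear part and the activation as separate blocks means the forward pass is just a loop over blocks, and each block only needs to know its own operation.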
