The basic element of a neural network is called a neuron. It is sometimes also called a perceptron, but I prefer to reserve “perceptron” for the learning algorithm; other common names are “unit” and “node”.

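The neuron is essentially a non-linear function of an input vector $x \in \mathbb{R}^m$ to a single output value $a \in \mathbb{R}$. It is parametrized by:

  • A vector of weights $(w_1, \ldots, w_m) \in \mathbb{R}^m$ and an offset/threshold $w_0 \in \mathbb{R}$.
  • An activation function $f : \mathbb{R} \to \mathbb{R}$, which gives us non-linearity.

In total, the function represented by the neuron can be summarized as:

$$a = f(z) = f\left(\sum_{j=1}^{m} x_j w_j + w_0\right) = f(w^T x + w_0)$$

The final formulation is just the activation function applied to the output of a linear classifier, $z = w^T x + w_0$.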
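As a minimal sketch of the neuron described above, the weighted sum of the inputs plus the offset is computed first and then passed through the activation function. The function names `neuron` and `sigmoid` are hypothetical, and the logistic sigmoid is only one possible choice of activation, assumed here for illustration.

```python
import math

def sigmoid(z):
    """Logistic activation; an assumed example of a non-linear f."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, w0, f):
    """Return f(w . x + w0) for input x, weights w, and offset w0."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w0  # linear part
    return f(z)                                    # non-linearity

x = [1.0, 2.0]       # input vector (illustrative values)
w = [0.5, -0.25]     # weights (illustrative values)
w0 = 0.0             # offset
a = neuron(x, w, w0, sigmoid)  # linear part is 0.5 - 0.5 + 0.0 = 0.0
```

Swapping `sigmoid` for any other one-argument function changes the activation without touching the linear part.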


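How do we train a single unit? Given a loss function $L(\text{guess}, \text{actual})$ and a dataset $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(n)}, y^{(n)})\}$, we can do gradient descent, adjusting the weights $w$ and the offset $w_0$ to minimize:

$$J(w, w_0) = \sum_{i=1}^{n} L\left(\mathrm{NN}(x^{(i)}; w, w_0),\, y^{(i)}\right)$$

where $\mathrm{NN}(x; w, w_0)$ is the output of our neural net for a given input $x$.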
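The training procedure above could be sketched as batch gradient descent on a single unit. This is an illustration under stated assumptions, not a prescribed implementation: the loss is taken to be the squared error $(a - y)^2$, the activation `f` and its derivative `df` are passed in by the caller, and the name `train_neuron`, the step size `eta`, the step count, and the toy dataset are all hypothetical.

```python
def train_neuron(data, f, df, eta=0.05, steps=2000):
    """Batch gradient descent on the sum over the data of (f(w . x + w0) - y)^2."""
    m = len(data[0][0])
    w, w0 = [0.0] * m, 0.0
    for _ in range(steps):
        grad_w, grad_w0 = [0.0] * m, 0.0
        for x, y in data:
            z = sum(wj * xj for wj, xj in zip(w, x)) + w0
            # Chain rule: dL/dz = dL/da * da/dz = 2 (a - y) * f'(z)
            dz = 2.0 * (f(z) - y) * df(z)
            for j in range(m):
                grad_w[j] += dz * x[j]   # dz/dw_j = x_j
            grad_w0 += dz                # dz/dw0 = 1
        # Step against the gradient
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        w0 -= eta * grad_w0
    return w, w0

# Toy data generated from y = 2x + 1 (assumed for illustration)
data = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0)]
w, w0 = train_neuron(data, f=lambda z: z, df=lambda z: 1.0)
```

With the identity activation and this toy data, the learned parameters should approach `w = [2.0]` and `w0 = 1.0`; a different activation only requires supplying its derivative as `df`.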

Linear classifiers with hinge loss and regression with quadratic loss, which we’ve studied, are two special cases of the neuron; both of them use the identity activation function $f(z) = z$.
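To make the two special cases concrete, here is a small sketch in which the same identity-activation unit is paired with either loss. The helper names (`linear_unit`, `hinge_loss`, `quadratic_loss`, `objective`) and the data values are hypothetical, and the hinge loss is assumed to use labels in {-1, +1}.

```python
def linear_unit(x, w, w0):
    """Neuron with identity activation: the output is w . x + w0."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w0

def hinge_loss(guess, actual):
    """Hinge loss, assuming labels actual in {-1, +1}."""
    return max(0.0, 1.0 - actual * guess)

def quadratic_loss(guess, actual):
    """Squared error, as used for regression."""
    return (guess - actual) ** 2

def objective(data, w, w0, loss):
    """Total loss of the unit over the dataset."""
    return sum(loss(linear_unit(x, w, w0), y) for x, y in data)

w, w0 = [1.0], 0.0
clf_data = [([2.0], 1.0), ([-0.5], -1.0)]   # classification labels in {-1, +1}
reg_data = [([1.0], 2.0), ([2.0], 5.0)]     # real-valued regression targets

hinge_total = objective(clf_data, w, w0, hinge_loss)     # 0.0 + 0.5
quad_total = objective(reg_data, w, w0, quadratic_loss)  # 1.0 + 9.0
```

Only the `loss` argument changes between the two cases; the unit itself is the same linear function in both.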