Sigmoid Neuron

The activity of a neuron is very low or zero when the input is low, and the activity goes up and approaches some maximum as the input increases. This general behavior can be represented by a few activation functions.

Logistic Curve

Goes from 0 to 1.
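
For reference, the standard logistic (sigmoid) function is

$$\sigma(z) = \frac{1}{1 + e^{-z}},$$

which tends to $0$ as $z \to -\infty$ and to $1$ as $z \to +\infty$.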

Arctan

Goes from $-\pi/2$ to $\pi/2$ instead.
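
For reference, $f(z) = \arctan(z)$ has range $(-\pi/2, \pi/2)$; a common rescaled variant is $\frac{2}{\pi}\arctan(z)$, which goes from $-1$ to $1$ instead.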

Hyperbolic Tangent

Goes from $-1$ to $1$.
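
For reference:

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} = 2\sigma(2z) - 1,$$

so it is just a shifted and rescaled logistic curve.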

Threshold

This is just a Heaviside function.
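
For reference (conventions for the value at $0$ vary; here taking $H(0) = 1$):

$$H(z) = \begin{cases} 0 & z < 0 \\ 1 & z \ge 0 \end{cases}$$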

Rectified Linear Unit (ReLU)

This is just a line that gets clipped below at zero:

$$f(z) = \max(0, z)$$

Leaky ReLU is conceptually the same, but instead of clipping to zero it keeps a small slope $\alpha$ for negative inputs, so the output goes a bit below zero. This can have some advantages, since the gradient is never exactly zero for negative inputs. A sketch of both appears below.
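
As a minimal NumPy sketch (the helper names and the slope $\alpha = 0.01$ are illustrative choices, not fixed conventions):

```python
import numpy as np

def relu(z):
    """Clip below at zero: max(0, z), elementwise."""
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    """Like ReLU, but keep a small slope alpha for negative inputs."""
    return np.where(z >= 0.0, z, alpha * z)

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))        # [0.  0.  0.  1.5]
print(leaky_relu(z))  # [-0.02  -0.005  0.  1.5]
```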

Softmax

Softmax depends on multiple neurons. Its output is like a probability distribution (a probability vector), so its elements add to $1$. If $z_i$ is the drive (input) to neuron $i$ in a set of neurons, then:

$$p_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$$

Then, by definition,

$$\sum_i p_i = 1,$$

so the outputs form a probability distribution. Thus, softmax turns a list of inputs (which need not be a distribution) into a distribution.
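
A minimal NumPy sketch (subtracting the largest drive before exponentiating is a standard trick to avoid overflow and does not change the result):

```python
import numpy as np

def softmax(z):
    """Turn a vector of drives z into a probability vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([1.0, 2.0, 3.0])
p = softmax(z)
print(p)        # [0.09003057 0.24472847 0.66524096]
print(p.sum())  # 1.0
```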

One-Hot

One-Hot is the extreme of the softmax, where only the largest element remains nonzero (it is set to $1$) while the others are set to zero. It can be seen as taking the limit of the softmax as the drives are scaled up (equivalently, as a temperature goes to zero).
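
Written out, assuming a unique maximum (with $T$ acting as a temperature):

$$\operatorname{onehot}(z)_i = \lim_{T \to 0^+} \frac{e^{z_i/T}}{\sum_j e^{z_j/T}} = \begin{cases} 1 & \text{if } i = \arg\max_j z_j \\ 0 & \text{otherwise} \end{cases}$$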