Often, we want our network to map multivariate inputs to multivariate output predictions.
Multivariate Outputs
To extend the network to multivariate outputs, we simply use a different linear function of the hidden units for each output. So, a network with a scalar input $x$, four hidden units $h_1, h_2, h_3, h_4$, and a 2D multivariate output $\mathbf{y} = [y_1, y_2]^T$ would be defined as

$$h_1 = a[\theta_{10} + \theta_{11}x] \qquad h_2 = a[\theta_{20} + \theta_{21}x] \qquad h_3 = a[\theta_{30} + \theta_{31}x] \qquad h_4 = a[\theta_{40} + \theta_{41}x]$$

and

$$y_1 = \phi_{10} + \phi_{11}h_1 + \phi_{12}h_2 + \phi_{13}h_3 + \phi_{14}h_4$$
$$y_2 = \phi_{20} + \phi_{21}h_1 + \phi_{22}h_2 + \phi_{23}h_3 + \phi_{24}h_4$$

where $a[\cdot]$ is the ReLU activation function.
The two outputs are two different linear functions of the hidden units.
- Recall (from Shallow Neural Network) that the joints in the piecewise functions depend on where the initial functions are clipped by the ReLU functions at the hidden units.
- Since both outputs $y_1$ and $y_2$ are different linear functions of the same four hidden units, the four joints are at the same places.
- However, the slopes of the linear regions and the overall vertical offsets can differ, since the output weights and biases that determine them are applied after the ReLU.
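As a concrete illustration, here is a minimal NumPy sketch of such a network. The weight values `theta` and `phi` are arbitrary made-up numbers, chosen only to show that both outputs share the same joint locations.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hidden-layer parameters: one bias and one slope per hidden unit (scalar input).
theta = np.array([[ 0.5, -1.0],    # theta_10, theta_11
                  [-0.2,  0.7],    # theta_20, theta_21
                  [ 0.9,  0.4],    # theta_30, theta_31
                  [-1.1,  1.3]])   # theta_40, theta_41

# Output-layer parameters: one bias and four weights per output (two outputs).
phi = np.array([[ 0.1,  1.0, -0.5,  0.8, -0.3],   # phi_10 ... phi_14
                [-0.4,  0.6,  0.9, -1.2,  0.5]])  # phi_20 ... phi_24

def shallow_net(x):
    """Scalar input x -> 2D output [y1, y2] via four ReLU hidden units."""
    h = relu(theta[:, 0] + theta[:, 1] * x)   # four clipped linear functions
    y = phi[:, 0] + phi[:, 1:] @ h            # two linear combinations of the same h
    return y

# Joints occur where a pre-activation crosses zero: x = -theta_j0 / theta_j1.
joints = -theta[:, 0] / theta[:, 1]
print("joint locations (shared by both outputs):", np.sort(joints))
print("y(0.3) =", shallow_net(0.3))
```

Both $y_1$ and $y_2$ change slope only at these four values of $x$; only the slopes and offsets of the segments differ between the two outputs.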
Multivariate Inputs
For multivariate inputs, we extend the linear relations between the input and the hidden units. So, a network with two inputs $\mathbf{x} = [x_1, x_2]^T$ and a scalar output $y$ might have 3 hidden units defined by:

$$h_1 = a[\theta_{10} + \theta_{11}x_1 + \theta_{12}x_2]$$
$$h_2 = a[\theta_{20} + \theta_{21}x_1 + \theta_{22}x_2]$$
$$h_3 = a[\theta_{30} + \theta_{31}x_1 + \theta_{32}x_2]$$

where there is now one slope parameter for each input. The hidden units are combined to form the output in the usual way:

$$y = \phi_0 + \phi_1 h_1 + \phi_2 h_2 + \phi_3 h_3$$
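A minimal sketch of this two-input network, again with arbitrary example weights; the helper also reports which hidden units are active (the activation pattern) at a given input point.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Arbitrary example parameters: 3 hidden units, each with a bias and two input slopes.
theta = np.array([[ 0.2,  1.0, -0.6],   # theta_10, theta_11, theta_12
                  [-0.5,  0.3,  0.9],   # theta_20, theta_21, theta_22
                  [ 0.1, -0.8,  0.4]])  # theta_30, theta_31, theta_32
phi = np.array([0.3, 1.2, -0.7, 0.5])   # phi_0 ... phi_3

def shallow_net_2d(x1, x2):
    """Two inputs -> scalar output via three ReLU hidden units."""
    pre = theta[:, 0] + theta[:, 1] * x1 + theta[:, 2] * x2   # three oriented planes
    h = relu(pre)                                             # clip negative halves to zero
    y = phi[0] + phi[1:] @ h
    pattern = tuple((pre > 0).astype(int))                    # which units are active here
    return y, pattern

y, pattern = shallow_net_2d(0.4, -0.2)
print("output:", y, "activation pattern:", pattern)
```

Every input with the same activation pattern lies in the same convex polygonal region, and within that region the output is a single linear function of $(x_1, x_2)$.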
- Each hidden unit receives a linear combination of the two inputs, which forms an oriented plane in the 3D input/output space.
- The activation function clips the negative values of these planes to zero.
- The clipped planes are then recombined in a second linear function to create a continuous piecewise linear surface consisting of convex polygonal regions.
- Each region corresponds to a different activation pattern. For example, in the central triangular region, the 1st and 3rd hidden units are active, with the 2nd inactive.
With more than 2 inputs, visualization becomes difficult, but the interpretation is similar: the output is just a continuous piecewise linear function of the input, where the linear regions are now convex polytopes in the multi-dimensional input space.
As the input dimension grows, the number of linear regions increases rapidly. Each hidden unit defines a hyperplane (a $(D_i - 1)$-dimensional plane, where $D_i$ is the number of input dimensions) that delineates the part of the space where this unit is active from the part where it is not. If we had the same number of hidden units as input dimensions $D_i$, we could align each hyperplane with one of the coordinate axes. For two input dimensions, this would divide the space into four quadrants. For three dimensions, this would create eight octants, and for $D_i$ dimensions, this would create $2^{D_i}$ orthants. Shallow neural networks usually have more hidden units than input dimensions, so they typically create more than $2^{D_i}$ linear regions.
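The orthant-counting argument can be checked numerically with a small sketch. Here the hyperplanes are assumed to be exactly the coordinate axes (a toy configuration where hidden unit $j$ is simply $\mathrm{ReLU}(x_j)$), so the distinct activation patterns found over random inputs should number $2^{D_i}$.

```python
import numpy as np

def count_activation_patterns(D, n_samples=100_000, seed=0):
    """Count distinct ReLU activation patterns for D axis-aligned hidden units.

    Each hidden unit j is active exactly when x_j > 0, so its hyperplane is a
    coordinate axis and the patterns correspond to the orthants of the input space.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, D))
    patterns = {tuple(row) for row in (x > 0).astype(int)}
    return len(patterns)

for D in (1, 2, 3, 4):
    print(f"D_i = {D}: {count_activation_patterns(D)} regions (expected {2**D})")
```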