The extension from 1D Gradient Descent to the multi-dimensional case is fairly straightforward.
Let us assume our parameters are $\boldsymbol{\theta} \in \mathbb{R}^n$, so that our objective function is $f: \mathbb{R}^n \to \mathbb{R}$. The gradient of $f$ with respect to $\boldsymbol{\theta}$ is:

$$\nabla_{\boldsymbol{\theta}} f(\boldsymbol{\theta}) = \left[ \frac{\partial f}{\partial \theta_1}, \frac{\partial f}{\partial \theta_2}, \ldots, \frac{\partial f}{\partial \theta_n} \right]^\top$$
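For instance, taking $f(\boldsymbol{\theta}) = \theta_1^2 + 2\theta_2^2$ (a function chosen here purely for illustration), each partial derivative is computed separately and stacked into a vector:

$$\nabla_{\boldsymbol{\theta}} f(\boldsymbol{\theta}) = \begin{bmatrix} 2\theta_1 \\ 4\theta_2 \end{bmatrix}$$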
The algorithm remains the same as the 1D case, except that the update step (line 5 in 1D Gradient Descent) becomes:

$$\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \alpha \, \nabla_{\boldsymbol{\theta}} f(\boldsymbol{\theta})$$
Remember that the gradient points in the direction of the greatest rate of increase; thus, we move in the negative gradient direction to descend.
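As a concrete illustration, here is a minimal NumPy sketch of the multi-dimensional update loop. The names used (`gradient_descent`, `grad_f`, `theta0`, `alpha`, `num_steps`) are illustrative assumptions, not taken from the 1D Gradient Descent listing, and the example objective is the same $f(\boldsymbol{\theta}) = \theta_1^2 + 2\theta_2^2$ used above.

```python
import numpy as np

def gradient_descent(grad_f, theta0, alpha=0.1, num_steps=100):
    """Minimize an objective by repeatedly stepping against its gradient.

    grad_f:    function returning the gradient vector at theta
    theta0:    initial parameter vector
    alpha:     learning rate (step size)
    num_steps: number of update iterations
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(num_steps):
        # Update step: subtract the gradient scaled by the learning rate,
        # i.e. move in the direction of steepest descent.
        theta = theta - alpha * grad_f(theta)
    return theta

# Example: f(theta) = theta_1^2 + 2*theta_2^2, with gradient [2*theta_1, 4*theta_2].
grad_f = lambda theta: np.array([2.0 * theta[0], 4.0 * theta[1]])
print(gradient_descent(grad_f, theta0=[5.0, -3.0]))  # converges toward [0, 0]
```

The only change from the 1D version is that `theta` and `grad_f(theta)` are vectors, so the subtraction updates every coordinate at once.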