Given an objective function, how do we optimize it?

The main idea is if we’re trying to optimize a function f, we compute the derivative at our current point and move in the direction that makes the function smaller. In terms of a machine learning model, we would have some objective function defining some surface over , and we want to find the value at the lowest point on the surface.