Instead of trying to intuit learning algorithms like the Perceptron, we can introduce a general framework for solving machine learning problems that lets us derive algorithms for arbitrarily complicated problems. This is done by framing machine learning as an optimization problem: using computational methods to find the minimum (or maximum) of a given function.
Fundamentally, we define an objective function $J(\theta)$, where $\theta$ are the parameters of our model. For a Linear Classifier, we would have $\theta = (\mathbf{w}, b)$. We also write $J(\theta; \mathcal{D})$ to indicate dependence on the data $\mathcal{D}$. Generally, the optimization problem is that we want to find $\theta^*$ such that:

$$\theta^* = \operatorname*{arg\,min}_{\theta} J(\theta)$$
which is to say that we want to find the parameter setting $\theta$ that minimizes $J(\theta)$.
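As a minimal sketch of this idea, the snippet below minimizes a simple one-dimensional objective with gradient descent; the quadratic objective and the step size are illustrative choices, not part of the original text:

```python
import numpy as np

# Hypothetical objective: J(theta) = (theta - 3)^2, minimized at theta* = 3.
def J(theta):
    return (theta - 3.0) ** 2

def grad_J(theta):
    # Derivative of J with respect to theta.
    return 2.0 * (theta - 3.0)

# Simple gradient descent to approximate theta* = argmin_theta J(theta).
theta = 0.0
lr = 0.1
for _ in range(100):
    theta -= lr * grad_J(theta)

print(theta)  # converges toward 3.0
```

The same loop structure carries over to real models: only the objective and its gradient change.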
A common objective function for machine learning is:

$$J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ell\big(f_\theta(x_i),\, y_i\big) + \lambda R(\theta)$$
where:
- $\ell$ is a Loss Function, which makes the first term equivalent to the training error.
- $R(\theta)$ is a regularizer, and $\lambda$ is a constant hyperparameter governing the trade-off between fitting the training data and keeping the model simple.
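To make this concrete, here is a sketch of evaluating such an objective for a linear classifier; the squared hinge loss and L2 regularizer are illustrative choices (any differentiable loss $\ell$ and regularizer $R$ would fit the same template):

```python
import numpy as np

def objective(w, b, X, y, lam):
    """J(theta) = average loss over the data + lam * regularizer.

    Illustrative choices: squared hinge loss, L2 regularizer.
    Labels y are assumed to be in {-1, +1}.
    """
    margins = y * (X @ w + b)                              # y_i * f_theta(x_i)
    loss = np.mean(np.maximum(0.0, 1.0 - margins) ** 2)    # squared hinge loss
    reg = np.dot(w, w)                                     # L2 regularizer R(w)
    return loss + lam * reg

# Tiny example: two linearly separated points in 1-D.
X = np.array([[2.0], [-2.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0])
print(objective(w, 0.0, X, y, lam=0.1))  # loss term is 0, so J = 0.1 * ||w||^2 = 0.1
```

With the objective in this form, deriving a learning algorithm reduces to choosing a method for minimizing `objective` over `w` and `b`.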