Suppose we have a data set of observations represented by the row vector $\mathbf{x} = (x_1, \dots, x_N)$, representing $N$ observations of a scalar variable $x$. These observations are drawn from a Gaussian whose parameters, the mean $\mu$ and variance $\sigma^2$, are unknown. Given our observations, we want to estimate these parameters to find the distribution that they came from.
Data points that are drawn independently from the same distribution are said to be independent and identically distributed (IID or i.i.d.). The joint probability of independent events is the product of the marginal probabilities of the individual events. Because our data set is i.i.d., we can therefore write the probability of the data set, given the mean $\mu$ and variance $\sigma^2$, as

$$p(\mathbf{x} \mid \mu, \sigma^2) = \prod_{n=1}^{N} \mathcal{N}(x_n \mid \mu, \sigma^2)$$
When viewed as a function of $\mu$ and $\sigma^2$, this is called the likelihood function for the Gaussian.
In the diagram, 2.55 refers to the equation above.
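To make the likelihood function concrete, here is a minimal Python sketch that evaluates the product of Gaussian densities for a small data set. The observations in `xs`, the helper names `gaussian_pdf` and `likelihood`, and the candidate parameter values are all illustrative assumptions, not part of the original text.

```python
import math


def gaussian_pdf(x, mu, sigma2):
    """Density of a univariate Gaussian with mean mu and variance sigma2, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def likelihood(xs, mu, sigma2):
    """Product of the per-observation densities, which relies on the i.i.d. assumption."""
    return math.prod(gaussian_pdf(x, mu, sigma2) for x in xs)


# Hypothetical observations, made up for illustration only.
xs = [1.2, 0.7, 1.9, 1.4, 0.8]

# Evaluate the same data under two candidate parameter settings.
like_near = likelihood(xs, mu=1.2, sigma2=0.25)  # mean close to the data
like_far = likelihood(xs, mu=5.0, sigma2=0.25)   # mean far from the data

print(like_near, like_far)
```

The data are held fixed while the parameters vary, which is what distinguishes the likelihood from a density over the data. For this made-up data set, the candidate mean close to the observations yields a much larger likelihood than the distant one. For large $N$ the product of many small densities can underflow in floating point, so in practice the sum of log densities is usually computed instead.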