Due to the nature of Transformation of Densities, the concept of the maximum of a probability density is dependent on the choice of variable.
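For a function $f(x)$ of a single variable $x$, suppose that $f(x)$ has a mode (i.e. a maximum) at $\hat{x}$ such that $f'(\hat{x}) = 0$. Under the change of variable $x = g(y)$ we have:
$$\tilde{f}(y) = f(g(y))$$
The corresponding mode will occur at a value $\hat{y}$ obtained by differentiating both sides with respect to $y$:
$$\tilde{f}'(\hat{y}) = f'(g(\hat{y}))\, g'(\hat{y}) = 0$$
Assuming $g'(\hat{y}) \neq 0$ at the mode, then $f'(g(\hat{y})) = 0$. We know that $f'(\hat{x}) = 0$, so we see that $\hat{x} = g(\hat{y})$, as we would expect. Thus, finding a mode with respect to the variable $x$ is equivalent to first transforming to the variable $y$, then finding a mode with respect to $y$, and then transforming back to $x$.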
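The claim for an ordinary function can be checked numerically. The sketch below is only an illustration: the Gaussian-shaped bump `f`, the shifted-logit map `g`, and the constants `MU`, `SIGMA` and `SHIFT` are assumed choices, not values taken from the text. It maximizes $f(g(y))$ on a grid and maps the maximizer back through $g$.

```python
import numpy as np

# Assumed illustrative constants (not taken from the text).
MU, SIGMA, SHIFT = 6.0, 1.0, 5.0

def f(x):
    # Gaussian-shaped function treated as a plain function; its maximum is at x = MU.
    return np.exp(-0.5 * ((x - MU) / SIGMA) ** 2)

def g(y):
    # Assumed change of variable x = g(y): a shifted logit on 0 < y < 1.
    return np.log(y) - np.log1p(-y) + SHIFT

# Maximize f~(y) = f(g(y)) on a fine grid in y.
y = np.linspace(1e-4, 1.0 - 1e-4, 200_001)
y_hat = y[np.argmax(f(g(y)))]

x_hat = MU            # mode of f found directly in x
x_back = g(y_hat)     # mode found in y, then mapped back to x

print(x_hat, x_back)
```

Up to the grid resolution, `x_back` should agree with `x_hat`, which is the relation $\hat{x} = g(\hat{y})$ for a function that carries no Jacobian factor.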
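For a density $p_x(x)$, and new density $p_y(y)$, transformed under $x = g(y)$ we can write:
$$p_y(y) = p_x(g(y))\, |g'(y)| = s\, p_x(g(y))\, g'(y)$$
where we simplify the modulus by choosing $s \in \{-1, +1\}$ such that $s\, g'(y)$ is always positive. Differentiating both sides with respect to $y$ gives:
$$p_y'(y) = s\, p_x'(g(y))\, \{g'(y)\}^2 + s\, p_x(g(y))\, g''(y)$$
Due to the presence of the second term on the right side, the relationship $\hat{x} = g(\hat{y})$ no longer holds in general; the second term vanishes only when $g''(y) = 0$, as for a linear change of variable. Thus, the value of $x$ obtained by maximizing $p_x(x)$ will not be the value obtained by transforming to $p_y(y)$, maximizing with respect to $y$, and then transforming back to $x$. This causes modes of densities to be dependent on the choice of variables.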
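The effect of the Jacobian factor can be made concrete with a small numerical sketch. The Gaussian parameters `MU` and `SIGMA`, the shifted-logit map `g`, and its offset `SHIFT` are assumptions chosen for illustration. The code builds $p_y(y) = p_x(g(y))\,|g'(y)|$ on a grid, locates its maximum, and also evaluates the second derivative term $p_x(g(y))\, g''(y)$ at the point where the first term is zero.

```python
import numpy as np

# Assumed illustrative constants (not taken from the text).
MU, SIGMA, SHIFT = 6.0, 1.0, 5.0

def p_x(x):
    # Gaussian density in x with mean MU and standard deviation SIGMA.
    return np.exp(-0.5 * ((x - MU) / SIGMA) ** 2) / (SIGMA * np.sqrt(2.0 * np.pi))

def g(y):
    # Assumed change of variable x = g(y): a shifted logit on 0 < y < 1.
    return np.log(y) - np.log1p(-y) + SHIFT

def g_prime(y):
    # First derivative of g; positive everywhere on (0, 1), so s = +1.
    return 1.0 / (y * (1.0 - y))

def g_second(y):
    # Second derivative of g.
    return (2.0 * y - 1.0) / (y * (1.0 - y)) ** 2

def p_y(y):
    # Transformed density, including the Jacobian factor |g'(y)|.
    return p_x(g(y)) * np.abs(g_prime(y))

y = np.linspace(1e-4, 1.0 - 1e-4, 200_001)
y_hat = y[np.argmax(p_y(y))]      # mode of the density in y
x_back = g(y_hat)                 # that mode mapped back to x

# Where the mode of p_x lands if it is simply pushed through the inverse map.
y_naive = 1.0 / (1.0 + np.exp(-(MU - SHIFT)))
# At y_naive the first term of p_y'(y) vanishes, leaving only the second term.
slope_at_naive = p_x(MU) * g_second(y_naive)

print(x_back, MU, y_hat, y_naive, slope_at_naive)
```

Under these assumptions `x_back` differs visibly from `MU`, and `slope_at_naive` is nonzero, so $p_y$ is still increasing at the point where the naive mapping would place the mode.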
In the example above, the original Gaussian $p_x(x)$ is sampled 50000 times to obtain a histogram. Each point is then transformed from $x$ to $y$ with:
The inverse of this is:
which is a logistic sigmoid function.
If we simply transform $p_x(x)$ as a function of $x$, we obtain the green curve $p_x(g(y))$; its mode is transformed via the sigmoid function as well. However, the density of $y$, $p_y(y)$, shown by the magenta curve, transforms instead according to our previously derived equations; its mode is shifted relative to the mode of the green curve.
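A sampling experiment in the spirit of this example can be sketched as follows. Only the sample count of 50000 comes from the text; the Gaussian mean and standard deviation, the offset inside the sigmoid, the bin count, and the random seed are assumed values picked for illustration.

```python
import numpy as np

# N comes from the text; the other constants are assumed for illustration.
MU, SIGMA, SHIFT, N = 6.0, 1.0, 5.0, 50_000

rng = np.random.default_rng(0)
x = rng.normal(MU, SIGMA, size=N)            # samples from the Gaussian in x

# Assumed map from x to y: a logistic sigmoid with offset SHIFT.
y = 1.0 / (1.0 + np.exp(-(x - SHIFT)))

# Histogram of the transformed samples approximates the density of y.
counts, edges = np.histogram(y, bins=50, range=(0.0, 1.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
y_mode_hist = centers[np.argmax(counts)]     # empirical mode of the density of y

# Mode of p_x pushed through the sigmoid (the mode of the plain function).
y_mode_naive = 1.0 / (1.0 + np.exp(-(MU - SHIFT)))

print(y_mode_hist, y_mode_naive)
```

With these assumed settings the tallest histogram bin sits to the right of `y_mode_naive`, which is the shift between the two curves described above. The histogram mode is noisy near the peak, so its exact bin can vary with the seed and bin width.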