We can extend the definition of entropy to include distributions over continuous variables $x$.
We can do this by dividing $x$ into “bins” of width $\Delta$. Then, assuming $p(x)$ is continuous, the mean value theorem tells us that, for each such bin, there must exist a value $x_i$ in the range $i\Delta \le x_i \le (i+1)\Delta$ such that

$$\int_{i\Delta}^{(i+1)\Delta} p(x)\,dx = p(x_i)\Delta.$$
We can now quantize the continuous variable $x$ by saying that any $x$ within the $i$th bin is approximated/quantized to the value $x_i$. Since $x_i$ represents all values in its bin, the probability of observing the value $x_i$ is then the integral of $p(x)$ over the bin (above), which is $p(x_i)\Delta$.
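As a quick illustration of this quantization step, here is a minimal sketch (the standard Gaussian density and the grid limits are assumptions chosen for the example): it bins a continuous density, computes each bin’s probability mass exactly from the CDF, and checks that the quantized distribution is normalized.

```python
import numpy as np
from scipy.stats import norm

delta = 0.1                                  # bin width Delta
edges = np.arange(-8.0, 8.0 + delta, delta)  # bin boundaries

# Probability mass in each bin: the integral of p(x) over the bin,
# computed exactly from the Gaussian CDF (assumed example density).
bin_probs = norm.cdf(edges[1:]) - norm.cdf(edges[:-1])

# By the mean value theorem this mass equals p(x_i) * Delta for some
# x_i inside the bin; the implied density values are therefore:
p_xi = bin_probs / delta

print(bin_probs.sum())  # ~1.0: the quantized distribution is normalized
```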
The entropy of this discrete distribution is then

$$H_\Delta = -\sum_i p(x_i)\Delta \ln\bigl(p(x_i)\Delta\bigr) = -\sum_i p(x_i)\Delta \ln p(x_i) - \ln\Delta,$$

where the second equality uses $\sum_i p(x_i)\Delta = 1$, which follows from the normalization of $p(x)$.
We omit the second term, $-\ln\Delta$, on the right-hand side, since it is independent of $p(x)$.
Now we consider the limit $\Delta \to 0$ of the first term, which gives

$$\lim_{\Delta \to 0}\left\{-\sum_i p(x_i)\Delta \ln p(x_i)\right\} = \boxed{-\int p(x)\ln p(x)\,dx}.$$
The boxed quantity on the right-hand side is called the differential entropy. The discrete and continuous forms of the entropy differ by a quantity $\ln\Delta$, which diverges in the limit $\Delta \to 0$, reflecting the fact that specifying a continuous value to arbitrary precision requires an unbounded number of bits.
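To make the divergence concrete, here is a minimal numerical sketch (the standard Gaussian example density and the grid limits are assumptions, not from the text): it computes the discrete entropy $H_\Delta$ for shrinking bin widths and compares it against $h - \ln\Delta$, where $h = \frac{1}{2}\ln(2\pi e)$ is the known differential entropy of $\mathcal{N}(0,1)$.

```python
import numpy as np
from scipy.stats import norm

h = 0.5 * np.log(2 * np.pi * np.e)  # differential entropy of N(0, 1)

for delta in [0.1, 0.01, 0.001]:
    edges = np.arange(-10.0, 10.0 + delta, delta)
    probs = norm.cdf(edges[1:]) - norm.cdf(edges[:-1])
    probs = probs[probs > 0]                   # drop empty bins (avoid log(0))
    H_delta = -np.sum(probs * np.log(probs))   # discrete entropy H_Delta
    print(f"{delta:7.3f}  H_Delta={H_delta:.4f}  h - ln(Delta)={h - np.log(delta):.4f}")
```

The two printed columns agree, and both grow without bound as $\Delta$ shrinks: the discrete entropy diverges while the differential entropy $h$ stays finite.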
For a density $p(\mathbf{x})$ defined over multiple continuous variables, denoted collectively by the vector $\mathbf{x}$, the differential entropy is given by

$$H[\mathbf{x}] = -\int p(\mathbf{x})\ln p(\mathbf{x})\,d\mathbf{x}.$$
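As a sketch of the multivariate formula (the 2-D Gaussian below is an assumed example, not from the text), we can estimate $H[\mathbf{x}] = -\mathbb{E}[\ln p(\mathbf{x})]$ by Monte Carlo and compare it with the known Gaussian closed form $\frac{1}{2}\ln\det(2\pi e\,\Sigma)$:

```python
import numpy as np
from scipy.stats import multivariate_normal

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])                 # assumed example covariance
dist = multivariate_normal(mean=np.zeros(2), cov=Sigma)

# Monte Carlo: differential entropy is the expectation of -ln p(x) under p,
# so a sample average of -log p at draws from p estimates it.
samples = dist.rvs(size=200_000, random_state=0)
mc_estimate = -np.mean(dist.logpdf(samples))

# Known closed form for a Gaussian: (1/2) * ln det(2 * pi * e * Sigma).
closed_form = 0.5 * np.log(np.linalg.det(2 * np.pi * np.e * Sigma))
print(mc_estimate, closed_form)                # the two should agree closely
```

Monte Carlo works here because the differential entropy is simply the expectation of $-\ln p(\mathbf{x})$ under $p$ itself, so no numerical integration over a grid is needed in higher dimensions.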