N1H111SM's Miniverse

# Divergence Estimation Framework

2020/05/12 Share Materials

## The f-divergence Family

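A large class of divergences is the family of so-called f-divergences, also known as Ali-Silvey distances. Given two distributions $P$ and $Q$ that possess, respectively, absolutely continuous density functions $p$ and $q$ with respect to a base measure $dx$ defined on the domain $\mathcal{X}$, the f-divergence is defined as:

$$D_f(P\|Q) = \int_{\mathcal{X}} q(x)\, f\!\left(\frac{p(x)}{q(x)}\right) dx$$

where the generator function $f: \mathbb{R}_{+} \to \mathbb{R}$ is convex, lower-semicontinuous, and satisfies $f(1) = 0$.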
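For discrete distributions the integral becomes a sum, $D_f(P\|Q) = \sum_x q(x)\, f(p(x)/q(x))$, which is easy to evaluate directly. The following is a minimal sketch under that discrete assumption; the function names (`f_divergence`, `kl_generator`, `total_variation_generator`) are illustrative and not taken from any library. It uses two well-known generators: $f(u) = u \log u$, which recovers the KL divergence, and $f(u) = \frac{1}{2}|u - 1|$, which recovers the total variation distance.

```python
import math

def f_divergence(p, q, f):
    """Discrete f-divergence: sum over x of q(x) * f(p(x) / q(x)).

    p and q are lists of probabilities over the same finite domain.
    This sketch requires p(x) == 0 wherever q(x) == 0.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if qx == 0:
            if px > 0:
                raise ValueError("p must be zero wherever q is zero")
            continue  # both are zero: the term contributes nothing
        total += qx * f(px / qx)
    return total

def kl_generator(u):
    # f(u) = u log u, with the convention 0 log 0 = 0
    return u * math.log(u) if u > 0 else 0.0

def total_variation_generator(u):
    # f(u) = |u - 1| / 2
    return 0.5 * abs(u - 1.0)

p = [0.5, 0.5]
q = [0.25, 0.75]
kl_pq = f_divergence(p, q, kl_generator)
tv_pq = f_divergence(p, q, total_variation_generator)
```

Both generators satisfy $f(1) = 0$, so the divergence of a distribution from itself is zero in either case.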

## Jensen-Shannon Divergence

Definition (Jensen-Shannon Divergence) The Jensen–Shannon divergence (JSD) is a symmetrized and smoothed version of the Kullback–Leibler divergence $D(P\|Q)$. It is defined by:

$$\mathrm{JSD}(P\|Q) = \frac{1}{2} D(P\|M) + \frac{1}{2} D(Q\|M)$$

where $M=\frac{1}{2}(P+Q)$.
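As a concrete illustration, the JSD of two discrete distributions can be computed by building the mixture $M$ and averaging the two KL terms. This is a minimal sketch; the helper names `kl_divergence` and `js_divergence` are illustrative, and the base-2 logarithm is an assumed default.

```python
import math

def kl_divergence(p, q, base=2.0):
    """KL divergence D(P || Q) for discrete distributions.

    Terms with p(x) == 0 contribute zero. This sketch assumes
    q(x) > 0 wherever p(x) > 0, which always holds when q is the mixture M.
    """
    total = 0.0
    for px, qx in zip(p, q):
        if px > 0:
            total += px * math.log(px / qx, base)
    return total

def js_divergence(p, q, base=2.0):
    """Jensen-Shannon divergence: average KL of P and Q to their mixture M."""
    m = [0.5 * (px + qx) for px, qx in zip(p, q)]
    return 0.5 * kl_divergence(p, m, base) + 0.5 * kl_divergence(q, m, base)

p = [0.5, 0.5, 0.0]
q = [0.1, 0.3, 0.6]
jsd_pq = js_divergence(p, q)
jsd_qp = js_divergence(q, p)
```

Unlike the KL divergence, the result stays finite even though $P$ assigns zero mass to the third outcome, and swapping the arguments does not change it.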

### Bound

The Jensen–Shannon divergence between two probability distributions is bounded by 1 when the base-2 logarithm is used: $0 \leq \mathrm{JSD}(P\|Q) \leq 1$. With the natural logarithm, the upper bound is $\ln 2$.
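The bound can be checked with a short derivation. This is a sketch for the discrete case with base-2 logarithms, writing $m(x) = \frac{1}{2}(p(x) + q(x))$ for the mixture; the same argument carries over to densities. Since $m(x) \geq \frac{1}{2} p(x)$ for every $x$, the ratio $p(x)/m(x)$ is at most 2, so

$$D(P\|M) = \sum_{x} p(x) \log_2 \frac{p(x)}{m(x)} \leq \sum_{x} p(x) \log_2 2 = 1,$$

and likewise $D(Q\|M) \leq 1$. Averaging the two terms gives

$$\mathrm{JSD}(P\|Q) = \frac{1}{2} D(P\|M) + \frac{1}{2} D(Q\|M) \leq 1,$$

with equality exactly when $P$ and $Q$ have disjoint supports, because then $m(x) = \frac{1}{2} p(x)$ wherever $p(x) > 0$ and $m(x) = \frac{1}{2} q(x)$ wherever $q(x) > 0$.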

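In generative models that rely on reconstruction (e.g., denoising, variational, and adversarial autoencoders), the reconstruction error can be related to the mutual information (MI) as follows:

$$\mathcal{I}_{e}(X, Y) = \mathcal{H}_{e}(X) - \mathcal{H}_{e}(X|Y) \geq \mathcal{H}_{e}(X) - \mathcal{R}_{e,d}(X|Y)$$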

where:

• $X$ and $Y$ denote the input and output of an encoder.
• $\mathcal{H}_{e}(X)$ and $\mathcal{H}_{e}(X|Y)$ denote the marginal and conditional entropy of $X$ in the distribution formed by applying the encoder to inputs sampled from the source distribution.
• $\mathcal{R}_{e,d}(X|Y)$ denotes the expected reconstruction error of $X$ given the codes $Y$.
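The relationship between mutual information and reconstruction error can be checked numerically on a small discrete example. The sketch below makes an assumption not stated above: the reconstruction error is taken to be the expected negative log-likelihood $-\mathbb{E}[\log q(x|y)]$ under a decoder $q(x|y)$, which is never smaller than the conditional entropy $\mathcal{H}_{e}(X|Y)$. The joint distribution, both decoders, and the function names are hypothetical values chosen for illustration.

```python
import math

def entropy(dist):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def mi_and_reconstruction_bound(joint, decoder):
    """Return (I(X;Y), H(X) - R(X|Y)) for discrete X and Y.

    joint[x][y]   = p(x, y), the joint induced by the source and the encoder
    decoder[y][x] = q(x | y), a hypothetical decoder; it must be positive
                    wherever p(x, y) > 0
    """
    n_x, n_y = len(joint), len(joint[0])
    p_x = [sum(joint[x]) for x in range(n_x)]
    p_y = [sum(joint[x][y] for x in range(n_x)) for y in range(n_y)]

    h_x = entropy(p_x)
    h_x_given_y = 0.0   # conditional entropy H(X|Y)
    recon_error = 0.0   # assumed form: expected negative log-likelihood
    for x in range(n_x):
        for y in range(n_y):
            pxy = joint[x][y]
            if pxy > 0:
                h_x_given_y -= pxy * math.log(pxy / p_y[y])
                recon_error -= pxy * math.log(decoder[y][x])
    return h_x - h_x_given_y, h_x - recon_error

# Hypothetical encoder output: Y agrees with X 80% of the time.
joint = [[0.4, 0.1], [0.1, 0.4]]
imperfect_decoder = [[0.7, 0.3], [0.3, 0.7]]  # mismatched posterior
exact_decoder = [[0.8, 0.2], [0.2, 0.8]]      # equals the true p(x | y)

mi, loose_bound = mi_and_reconstruction_bound(joint, imperfect_decoder)
_, tight_bound = mi_and_reconstruction_bound(joint, exact_decoder)
```

Under this assumption, the lower bound $\mathcal{H}_{e}(X) - \mathcal{R}_{e,d}(X|Y)$ is loose for the mismatched decoder and becomes tight when the decoder matches the true posterior of $X$ given $Y$.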