A Bayesian characterization of relative entropy (Q2877683)

From MaRDI portal





scientific article; zbMATH DE number 6334034

    Statements

    25 August 2014
    relative entropy
    Kullback-Leibler divergence
    measures of information
    cs.IT
    math-ph
    math.IT
    math.MP
    math.PR
    quant-ph
    A Bayesian characterization of relative entropy (English)
    This paper gives a new characterization of the concept of relative entropy, also known as \textit{relative information}, \textit{relative gain} or \textit{Kullback-Leibler divergence}. Whenever we have two probability distributions \(p\) and \(q\) on the same set \(X\), we define the information of \(q\) relative to \(p\) as
    \[
    S(q,p)=\sum_{x\in X}q_{x}\ln\left( \frac{q_{x}}{p_{x}}\right)
    \]
    where \(q_{x}\ln\left( \frac{q_{x}}{p_{x}}\right) \) is set equal to \(\infty\) when \(p_{x}=0\), unless \(q_{x}\) is also \(0\), in which case it is set equal to \(0\).

    Bayesian probability theory emphasizes the role of the prior, so that relative entropy naturally lends itself to a Bayesian interpretation [\textit{P. Baldi} and \textit{L. Itti}, Neural Netw. 23, No. 5, 649--666 (2010; Zbl 1401.62225)]. The goal of this paper is to make this precise in a mathematical characterization of relative entropy. The authors consider a category \(\mathtt{FinStat}\), where an object \((X,q)\) is a finite set \(X\) equipped with a probability distribution \(x\mapsto q_{x}\), while a morphism \((f,s):(X,q)\rightarrow(Y,r)\) is a measure-preserving function \(f:X\rightarrow Y\) together with a probability distribution \(x\mapsto s_{xy}\) on \(X\) for each element \(y\in Y\), with the property that \(s_{xy}=0\) unless \(f(x)=y\).

    Intuitively speaking, an object of \(\mathtt{FinStat}\) is to be thought of as a system with some finite set of states together with a probability distribution on it.
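The definition of \(S(q,p)\) above, including its conventions for zero probabilities, can be illustrated with a short Python sketch. The function name `relative_entropy` and the representation of distributions as dictionaries are assumptions made here for illustration; they do not come from the paper or the review.

```python
import math


def relative_entropy(q, p):
    """Return S(q, p) = sum_x q_x * ln(q_x / p_x).

    q and p are dicts mapping the elements of the same finite set X to
    probabilities (an illustrative representation, not the paper's).
    """
    total = 0.0
    for x, qx in q.items():
        if qx == 0.0:
            # Convention: a term with q_x = 0 contributes 0, even if p_x = 0.
            continue
        px = p.get(x, 0.0)
        if px == 0.0:
            # Convention: q_x > 0 and p_x = 0 makes the term, and the sum, infinite.
            return math.inf
        total += qx * math.log(qx / px)
    return total
```

For instance, `relative_entropy({"a": 1.0, "b": 0.0}, {"a": 0.5, "b": 0.5})` evaluates the single nonzero term \(1\cdot\ln(1/0.5)\), while swapping the two arguments yields `math.inf` because the second distribution then assigns probability zero to an outcome the first considers possible.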
    A morphism \((f,s):(X,q)\rightarrow(Y,r)\) consists of a deterministic measuring process \(f:X\rightarrow Y\), mapping states of the system under measurement to states of a measuring apparatus, together with a hypothesis \(s\), which gives the probability \(s_{xy}\) that the system under measurement is in the state \(x\) given the measurement outcome \(y\in Y\).

    Given a morphism \((f,s):(X,q)\rightarrow(Y,r)\) in \(\mathtt{FinStat}\), the authors define
    \[
    \mathrm{RE}(f,s)=S(q,p)
    \]
    where
    \[
    p_{x}=s_{xf(x)}r_{f(x)}
    \]
    and \(s\) is said to be \textit{optimal} if the above equation gives a prior \(p\) equal to the true probability distribution \(q\) on the states of the system under measurement. It is nontrivial and rather interesting that
    \[
    \mathrm{RE}:\mathtt{FinStat}\rightarrow [0,\infty]
    \]
    is a functor, where \([0,\infty]\) is thought of as a category with one object, whose morphisms are the nonnegative real numbers together with \(\infty\) and whose composition is addition. Functoriality of \(\mathrm{RE}\) means that, given
    \[
    (X,q) \xrightarrow{(f,s)} (Y,r) \xrightarrow{(g,t)} (Z,u)
    \]
    we have
    \[
    \mathrm{RE}\left((g,t) \circ (f,s)\right)=\mathrm{RE}(g,t) +\mathrm{RE}(f,s)
    \]
    The main result of this paper (Theorem 3.1), which was inspired by \textit{D. Petz} [Acta Math. Hung. 59, No. 3--4, 449--455 (1992; Zbl 0765.46045)] in both its formulation and its proof, is that \(\mathrm{RE}\) is, up to constant multiples, the unique functor from \(\mathtt{FinStat}\) to \([0,\infty]\) obeying the following three conditions:

    \begin{itemize}
    \item[1.] \(\mathrm{RE}\) vanishes on morphisms with an optimal hypothesis.
    \item[2.] \(\mathrm{RE}\) is lower semicontinuous.
    \item[3.] \(\mathrm{RE}\) is convex linear.
    \end{itemize}
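The additivity of \(\mathrm{RE}\) under composition and its vanishing on optimal hypotheses can be checked numerically on a small example. The following Python sketch is illustrative only. All names (`relative_entropy`, `pushforward`, `prior`, `RE`, `compose`) and the dictionary encoding are assumptions made here. The review does not spell out composition in \(\mathtt{FinStat}\); the sketch assumes \((g,t)\circ(f,s)=(g\circ f,\,s\circ t)\) with \((s\circ t)_{xz}=\sum_{y}s_{xy}t_{yz}\).

```python
import math


def relative_entropy(q, p):
    """S(q, p) = sum_x q_x ln(q_x / p_x), with the 0 and infinity conventions."""
    total = 0.0
    for x, qx in q.items():
        if qx == 0.0:
            continue  # a term with q_x = 0 contributes 0
        px = p.get(x, 0.0)
        if px == 0.0:
            return math.inf  # q_x > 0 but p_x = 0
        total += qx * math.log(qx / px)
    return total


def pushforward(q, f):
    """Distribution r on Y with r_y = sum of q_x over x with f(x) = y."""
    r = {}
    for x, qx in q.items():
        r[f[x]] = r.get(f[x], 0.0) + qx
    return r


def prior(f, s, r):
    """Prior p_x = s_{x f(x)} * r_{f(x)} determined by the hypothesis s."""
    return {x: s[(x, y)] * r[y] for x, y in f.items()}


def RE(q, f, s, r):
    """RE(f, s) = S(q, p) for a morphism (f, s): (X, q) -> (Y, r)."""
    return relative_entropy(q, prior(f, s, r))


def compose(f, s, g, t):
    """Assumed composite (g o f, s o t).

    Because s_{xy} = 0 unless y = f(x) and t_{yz} = 0 unless z = g(y), the
    sum over y in (s o t)_{xz} has a single nonzero term, at z = g(f(x)).
    """
    gf = {x: g[y] for x, y in f.items()}
    st = {(x, g[y]): s[(x, y)] * t[(y, g[y])] for x, y in f.items()}
    return gf, st


# Hypothetical example data: X = {0, 1, 2, 3}, Y = {"a", "b"}, Z = {"*"}.
q = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
f = {0: "a", 1: "a", 2: "b", 3: "b"}
g = {"a": "*", "b": "*"}
r = pushforward(q, f)  # makes f measure-preserving
u = pushforward(r, g)  # makes g measure-preserving
s = {(0, "a"): 0.5, (1, "a"): 0.5, (2, "b"): 0.25, (3, "b"): 0.75}
t = {("a", "*"): 0.5, ("b", "*"): 0.5}

gf, st = compose(f, s, g, t)
lhs = RE(q, gf, st, u)                 # RE((g,t) o (f,s))
rhs = RE(r, g, t, u) + RE(q, f, s, r)  # RE(g,t) + RE(f,s)

# Optimal hypothesis: s_{x f(x)} = q_x / r_{f(x)}, so the prior p equals q.
s_opt = {(x, y): q[x] / r[y] for x, y in f.items()}
re_opt = RE(q, f, s_opt, r)
```

In this example `lhs` and `rhs` agree up to floating-point rounding, and `re_opt` is zero up to rounding, consistent with functoriality and with condition 1. A single numerical instance is of course no substitute for the proof in the paper.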

    Identifiers