Notes on some common concepts.
Information entropy is the average rate at which information is produced by a stochastic source of data.
\[H(X)=-\sum_{x} p(x) \log _{2} p(x)\]
More: https://en.wikipedia.org/wiki/Entropy_(information_theory)
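As a small illustration of the entropy formula, here is a minimal sketch in Python for a discrete distribution given as a list of probabilities. The function name `entropy` and the example coins are assumptions made for this note, not part of the original.

```python
import math

def entropy(p, base=2.0):
    """Shannon entropy of a discrete distribution p (a list of probabilities)."""
    # Zero-probability outcomes are skipped, following the convention 0 * log 0 = 0.
    return -sum(px * math.log(px, base) for px in p if px > 0)

# Hypothetical example distributions.
fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]

h_fair = entropy(fair_coin)      # entropy in bits of a fair coin
h_biased = entropy(biased_coin)  # a biased coin is more predictable
print(h_fair, h_biased)
```

With base 2 the result is in bits. The biased coin carries less information per toss than the fair one, so its entropy is lower.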
The cross entropy of a distribution q relative to a distribution p is
\[H(p,q)=-\sum _{x\in {\mathcal {X}}}p(x)\,\log q(x)=H(p)+D_{\mathrm {KL} }(p\|q)\]
More: https://en.wikipedia.org/wiki/Cross_entropy
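The identity H(p,q) = H(p) + D_KL(p‖q) can be checked numerically. The sketch below assumes discrete distributions with q(x) > 0 wherever p(x) > 0, and uses natural logarithms throughout. The helper names and the example values of `p` and `q` are illustrative assumptions.

```python
import math

def entropy(p):
    # H(p) = -sum_x p(x) log p(x), skipping zero-probability outcomes.
    return -sum(px * math.log(px) for px in p if px > 0)

def cross_entropy(p, q):
    # H(p, q) = -sum_x p(x) log q(x); assumes q(x) > 0 wherever p(x) > 0.
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)

def kl_divergence(p, q):
    # D_KL(p || q) = sum_x p(x) log(p(x) / q(x)).
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

# Hypothetical example distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

lhs = cross_entropy(p, q)
rhs = entropy(p) + kl_divergence(p, q)
print(lhs, rhs)  # the two values agree up to floating-point error
```

Since D_KL(p‖q) is non-negative, the cross entropy is never smaller than H(p), and the two are equal when q = p.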
The Kullback-Leibler (KL) divergence is also called relative entropy.
\[{\displaystyle D_{\text{KL}}(P\parallel Q)=\int _{-\infty }^{\infty }p(x)\log \left({\frac {p(x)}{q(x)}}\right)\,dx}\]
More: https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
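The formula above is stated for densities. For discrete distributions the integral becomes a sum, and a minimal Python sketch under that assumption looks like this. The function name and the example distributions are hypothetical. Returning infinity when q(x) = 0 but p(x) > 0 is the usual convention.

```python
import math

def kl_divergence(p, q):
    """Discrete D_KL(p || q) in nats for two lists of probabilities."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue          # 0 * log(0 / q) is taken to be 0
        if qx == 0:
            return math.inf   # p puts mass where q has none
        total += px * math.log(px / qx)
    return total

# Hypothetical example distributions.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

forward = kl_divergence(p, q)
backward = kl_divergence(q, p)
print(forward, backward)  # the two directions generally differ
```

The example shows that KL divergence is not symmetric, which is why it is not a true distance metric.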
The Jensen-Shannon divergence (JSD) is also called the information radius or the total divergence to the average.
\[{\displaystyle {\rm {JSD}}(P\parallel Q)={\frac {1}{2}}D(P\parallel M)+{\frac {1}{2}}D(Q\parallel M)}\]
where \({\displaystyle M={\frac {1}{2}}(P+Q)}\)
More: https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence
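A minimal sketch of the JSD definition for discrete distributions, assuming natural logarithms. The function names and example values are illustrative. Because M is positive wherever P or Q is, the KL terms here are always finite.

```python
import math

def kl_divergence(p, q):
    # D(p || q) = sum_x p(x) log(p(x) / q(x)), skipping zero-probability outcomes of p.
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

def js_divergence(p, q):
    # M is the pointwise average of the two distributions.
    m = [0.5 * (px + qx) for px, qx in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical example distributions.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

jsd_pq = js_divergence(p, q)
jsd_qp = js_divergence(q, p)

# Distributions with disjoint support reach the upper bound log 2 (in nats).
jsd_disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
print(jsd_pq, jsd_qp, jsd_disjoint)
```

Unlike KL divergence, JSD is symmetric in its arguments and bounded above by log 2 when natural logarithms are used.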
For a convex function f with f(1) = 0, the f-divergence of P from Q is
\[{\displaystyle D_{f}(P\parallel Q)\equiv \int _{\Omega }f\left({\frac {dP}{dQ}}\right)\,dQ.}\]
More: https://en.wikipedia.org/wiki/F-divergence
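In the discrete case with q(x) > 0 everywhere, dP/dQ reduces to the ratio p(x)/q(x) and the integral becomes a sum over x of q(x) f(p(x)/q(x)). The sketch below rests on that assumption and shows two well-known choices of f: f(t) = t log t recovers KL divergence, and f(t) = |t - 1| / 2 gives total variation distance. All names and example values are hypothetical.

```python
import math

def f_divergence(p, q, f):
    # Discrete form: sum_x q(x) * f(p(x) / q(x)); assumes q(x) > 0 for every x.
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def f_kl(t):
    # Generator for KL divergence; f(0) is taken as the limit value 0.
    return t * math.log(t) if t > 0 else 0.0

def f_tv(t):
    # Generator for total variation distance.
    return 0.5 * abs(t - 1.0)

# Hypothetical example distributions.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

kl_via_f = f_divergence(p, q, f_kl)
tv_via_f = f_divergence(p, q, f_tv)

# Direct formulas for comparison.
kl_direct = sum(px * math.log(px / qx) for px, qx in zip(p, q))
tv_direct = 0.5 * sum(abs(px - qx) for px, qx in zip(p, q))
print(kl_via_f, kl_direct, tv_via_f, tv_direct)
```

Both generators satisfy f(1) = 0, so the divergence of any distribution from itself is zero.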
Original post: https://www.cnblogs.com/cuhm/p/10628116.html