2016 Poster Sessions : Minimax Estimation of KL Divergence between Discrete Distributions

Student Name : Yanjun Han
Advisor : Tsachy Weissman
Research Areas: Information Systems
We consider the problem of estimating the KL divergence between two discrete probability measures $P$ and $Q$ from empirical data in a non-asymptotic and possibly large alphabet setting. We construct minimax rate-optimal estimators for $D(P|Q)$ when the likelihood ratio is upper bounded by a constant which may depend on the support size, and show that the performance of the optimal estimator with $n$ samples is essentially that of the Maximum Likelihood Estimator (MLE) with $nln n$ samples. Our estimator is adaptive in the sense that it does not require the knowledge of the support size or the upper bound on the likelihood ratio. Our approach builds on the emph{Approximation} methodology recently developed for the construction of near minimax estimators of functionals of high-dimensional parameters, such as entropy, mutual information and $ell_1$ distance in large alphabet settings, and shows that the emph{effective sample size enlargement} phenomenon holds significantly more widely than previously established.

Yanjun Han received his B.Eng. degree with the highest honor in Electronic Engineering from Tsinghua University, Beijing, China in 2015. He is currently working towards the Ph.D. degree in the Department of Electrical Engineering at Stanford University. His research interests include information theory and statistics, with applications in communications, data compression, and learning.