26/08/2020

Gaussian Sketching yields a J-L Lemma in RKHS

Samory Kpotufe, Bharath Sriperumbudur

Keywords:

Abstract: The main contribution of the paper is to show that Gaussian sketching of a kernel-Gram matrix $\bm K$ yields an operator whose counterpart in an RKHS $\cal H$, is a \emph{random projection} operator---in the spirit of Johnson-Lindenstrauss (J-L) lemma. To be precise, given a random matrix $Z$ with i.i.d. Gaussian entries, we show that a sketch $Z\bm{K}$ corresponds to a particular random operator in (infinite-dimensional) Hilbert space $\cal H$ that maps functions $f \in \cal H$ to a low-dimensional space $\bb R^d$, while preserving a weighted RKHS inner-product of the form $\langle f, g \rangle_{\Sigma} \doteq \langle f, \Sigma^3 g \rangle_{\cal H}$, where $\Sigma$ is the \emph{covariance} operator induced by the data distribution. In particular, under similar assumptions as in kernel PCA (KPCA), or kernel $k$-means (K-$k$-means), well-separated subsets of feature-space $\{K(\cdot, x): x \in \cal X\}$ remain well-separated after such operation, which suggests similar benefits as in KPCA and/or K-$k$-means, albeit at the much cheaper cost of a random projection. In particular, our convergence rates suggest that, given a large dataset $\{X_i\}_{i=1}^N$ of size $N$, we can build the Gram matrix $\bm K$ on a much smaller subsample of size $n\ll N$, so that the sketch $Z\bm K$ is very cheap to obtain and subsequently apply as a projection operator on the original data $\{X_i\}_{i=1}^N$. We verify these insights empirically on synthetic data, and on real-world clustering applications.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at AISTATS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers

 16:25