06/12/2020

Efficient active learning of sparse halfspaces with arbitrary bounded noise

Chicheng Zhang, Jie Shen, Pranjal Awasthi

Keywords:

Abstract: We study active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ under the setting where the unlabeled data distribution is isotropic log-concave and each label is flipped with probability at most $\eta$ for a parameter $\eta \in \big[0, \frac12\big)$, known as the bounded noise. Even in the presence of mild label noise, i.e. $\eta$ is a small constant, this is a challenging problem and only recently have label complexity bounds of the form $\tilde{O}(s \cdot polylog(d, \frac{1}{\epsilon}))$ been established in [Zhang 2018] for computationally efficient algorithms. In contrast, under high levels of label noise, the label complexity bounds achieved by computationally efficient algorithms are much worse: the best known result [Awasthi et al. 2016] provides a computationally efficient algorithm with label complexity $\tilde{O}((s ln d/\epsilon)^{poly(1/(1-2\eta))})$, which is label-efficient only when the noise rate $\eta$ is a fixed constant. In this work, we substantially improve on it by designing a polynomial time algorithm for active learning of $s$-sparse halfspaces, with a label complexity of $\tilde{O}\big(\frac{s}{(1-2\eta)^4} polylog (d, \frac 1 \epsilon) \big)$. This is the first efficient algorithm with label complexity polynomial in $\frac{1}{1-2\eta}$ in this setting, which is label-efficient even for $\eta$ arbitrarily close to $\frac12$. Our active learning algorithm and its theoretical guarantees also immediately translate to new state-of-the-art label and sample complexity results for full-dimensional active and passive halfspace learning under arbitrary bounded noise.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at NeurIPS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers