Exploiting Local Convergence of Quasi-Newton Methods Globally: Adaptive Sample Size Approach

06/12/2021

Qiujiang Jin, Aryan Mokhtari

Keywords: optimization

Abstract: In this paper, we study the application of quasi-Newton methods for solving empirical risk minimization (ERM) problems defined over a large dataset. Traditional deterministic and stochastic quasi-Newton methods can be executed to solve such problems; however, it is known that their global convergence rate may not be better than that of first-order methods, and their local superlinear convergence only appears towards the end of the learning process. We use an adaptive sample size scheme that exploits the superlinear convergence of quasi-Newton methods globally and throughout the entire learning process. The main idea of the proposed adaptive sample size algorithms is to start with a small subset of data points and solve the corresponding ERM problem to within its statistical accuracy, then enlarge the sample size geometrically and use the optimal solution of the problem corresponding to the smaller set as the initial point for solving the subsequent ERM problem with more samples. We show that if the initial sample size is sufficiently large and we use quasi-Newton methods to solve each subproblem, the subproblems can be solved superlinearly fast (after at most three iterations), as we guarantee that the iterates always stay within a neighborhood in which quasi-Newton methods converge superlinearly. Numerical experiments on various datasets confirm our theoretical results and demonstrate the computational advantages of our method.

The talk and the respective paper are published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2021

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser, Gleb Novikov, David Steurer, Stefan Tiegel

Keywords: optimization

10:44

18/07/2021

Lenient Regret and Good-Action Identification in Gaussian Process Bandits

Xu Cai, Selwyn Gomes, Jonathan Scarlett

Keywords: Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

5:10

06/12/2020

Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization

Jonathan Lacotte, Mert Pilanci

3:17

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

3:20

09/07/2020

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

Zhixian Lei, Kyle Luh, Prayaag Venkat, Fred Zhang

Keywords: High-dimensional statistics, Adversarial learning and robustness

15:00

06/12/2021

List-Decodable Mean Estimation in Nearly-PCA Time

Ilias Diakonikolas, Daniel Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian

Keywords: theory, clustering

14:21

12/07/2020

Optimization from Structured Samples for Coverage Functions

Wei Chen, Xiaoming Sun, Jialin Zhang, Zhijie Zhang

Keywords: Optimization - General

14:22

04/08/2021

Group testing and local search: is there a computational-statistical gap?

Fotis Iliopoulos, Ilias Zadik

17:50

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

3:11

18/07/2021

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning

Yaqi Duan, Chi Jin, Zhiyuan Li

Keywords: Theory, RL, Decisions and Control Theory

5:18

06/12/2020

List-Decodable Mean Estimation via Iterative Multi-Filtering

Ilias Diakonikolas, Daniel Kane, Daniel Kongsgaard

3:12

26/08/2020

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

Kenji Kawaguchi, Haihao Lu

14:10

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia Lin

Keywords: optimization, machine learning

14:51

09/07/2020

The estimation error of general first order methods

Michael V Celentano, Andrea Montanari, Yuchen Wu

Keywords: High-dimensional statistics, Computational complexity, Matrix/tensor estimation, Regression

14:10

18/07/2021

Communication-Efficient Distributed Optimization with Quantized Preconditioners

Foivos Alimisis, Peter Davies, Dan Alistarh

Keywords: Optimization, Distributed and Parallel Optimization

5:33

18/07/2021

Robust Density Estimation from Batches: The Best Things in Life are (Nearly) Free

Ayush Jain, Alon Orlitsky

Keywords: Theory, Statistical Learning Theory

17:23

06/12/2020

Fourier Sparse Leverage Scores and Approximate Kernel Learning

Tamas Erdelyi, Cameron Musco, Christopher Musco

3:25

18/07/2021

Exact Optimization of Conformal Predictors via Incremental and Decremental Learning

Giovanni Cherubin, Konstantinos Chatzikokolakis, Martin Jaggi

Keywords: Probabilistic Methods

5:48

12/07/2020

Composable Sketches for Functions of Frequencies: Beyond the Worst Case

Edith Cohen, Ofir Geri, Rasmus Pagh

Keywords: Optimization - Large Scale, Parallel and Distributed

14:51

06/12/2020

Minimax Estimation of Conditional Moment Models

Nishanth Dikkala, Greg Lewis, Lester Mackey, Vasilis Syrgkanis

3:04

03/05/2021

Learning-based Support Estimation in Sublinear Time

Talya Eden, Piotr Indyk, Shyam Narayanan, Ronitt Rubinfeld, Sandeep Silwal, Tal Wagner

Keywords: chebyshev polynomial, distinct elements, learning-based, sublinear, support estimation

9:48

06/12/2020

Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Evolvability

Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

3:26

26/04/2020

Sign-OPT: A Query-Efficient Hard-label Adversarial Attack

Minhao Cheng, Simranjit Singh, Patrick H. Chen, Pin-Yu Chen, Sijia Liu, Cho-Jui Hsieh

4:56

06/12/2021

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

Maura Pintor, Fabio Roli, Wieland Brendel, Battista Biggio

Keywords: optimization, machine learning, robustness, adversarial robustness and security, vision

11:35

12/07/2020

Manifold Identification for Ultimately Communication-Efficient Distributed Optimization

Yu-Sheng Li, Wei-Lin Chiang, Ching-pei Lee

Keywords: Optimization - Large Scale, Parallel and Distributed

15:12

14/09/2020

A General Machine Learning Framework for Survival Analysis

Andreas Bender, David Rügamer, Fabian Scheipl, Bernd Bischl

Keywords: survival analysis, gradient boosting, neural networks, competing risks, multi-state models

13:37

22/06/2020

Learning mixtures of linear regressions in subexponential time via fourier moments

Sitan Chen, Jerry Li, Zhao Song

Keywords: Fourier analysis, unsupervised learning, Mixture models, linear regression, method of moments

24:33

12/07/2020

A simpler approach to accelerated optimization: iterative averaging meets optimism

Pooria Joulani, Anant Raj, András György, Csaba Szepesvari

Keywords: Online Learning, Active Learning, and Bandits

16:17

06/12/2021

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms

Alexander Camuto, George Deligiannidis, Murat Erdogdu, Mert Gurbuzbalaban, Umut Simsekli, Lingjiong Zhu

Keywords: theory, deep learning, optimization

14:36

06/12/2020

Fast Epigraphical Projection-based Incremental Algorithms for Wasserstein Distributionally Robust Support Vector Machine

Jiajin Li, Caihua Chen, Anthony Man-Cho So

Keywords: Algorithms -> Meta-Learning; Applications -> Object Recognition; Data, Challenges, Implementations, and Software -> Benchmarks; Algorithms -> Multitask and Transfer Learning

3:02

06/12/2020

Linear-Sample Learning of Low-Rank Distributions

Ayush Jain, Alon Orlitsky

3:22

06/12/2020

Escaping Saddle-Point Faster under Interpolation-like Conditions

Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra

3:19

02/02/2021

Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback

Yihan Du, Yuko Kuroki, Wei Chen

17:13

06/12/2020

Learning Feature Sparse Principal Subspace

Lai Tian, Feiping Nie, Rong Wang, Xuelong Li

3:13

06/12/2021

An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders

Xinmeng Huang, Kun Yuan, Xianghui Mao, Wotao Yin

Keywords: optimization

14:39

26/04/2020

A Stochastic Derivative Free Optimization Method with Momentum

Eduard Gorbunov, Adel Bibi, Ozan Sener, El Houcine Bergou, Peter Richtarik

Keywords: derivative-free optimization, stochastic optimization, heavy ball momentum, importance sampling

4:51

03/05/2021

Effective Distributed Learning with Random Features: Improved Bounds and Algorithms

Yong Liu, Jiankun Liu, Shuqiang Wang

Keywords: statistical learning theory, kernel methods, Risk bound

4:25

06/12/2021

ReLU Regression with Massart Noise

Ilias Diakonikolas, Jong Ho Park, Christos Tzamos

11:59

26/08/2020

Distributionally Robust Formulation and Model Selection for the Graphical Lasso

Pedro Cisneros, Alexander Petersen, Sang-Yun Oh

14:08

12/07/2020

Efficiently sampling functions from Gaussian process posteriors

James Wilson, Viacheslav Borovitskiy, Alexander Terenin, Peter Mostowsky, Marc Deisenroth

Keywords: Gaussian Processes

14:40