Geometry-Aware Gradient Algorithms for Neural Architecture Search

03/05/2021

Liam Li, Misha Khodak, Nina Balcan, Ameet Talwalkar

Keywords: weight-sharing, neural architecture search, optimization, automated machine learning

Abstract: Recent state-of-the-art methods for neural architecture search (NAS) exploit gradient-based optimization by relaxing the problem into continuous optimization over architectures and shared-weights, a noisy process that remains poorly understood. We argue for the study of single-level empirical risk minimization to understand NAS with weight-sharing, reducing the design of NAS methods to devising optimizers and regularizers that can quickly obtain high-quality solutions to this problem. Invoking the theory of mirror descent, we present a geometry-aware framework that exploits the underlying structure of this optimization to return sparse architectural parameters, leading to simple yet novel algorithms that enjoy fast convergence guarantees and achieve state-of-the-art accuracy on the latest NAS benchmarks in computer vision. Notably, we exceed the best published results for both CIFAR and ImageNet on both the DARTS search space and NAS-Bench-201; on the latter we achieve near-oracle-optimal performance on CIFAR-10 and CIFAR-100. Together, our theory and experiments demonstrate a principled way to co-design optimizers and continuous relaxations of discrete NAS search spaces.
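The abstract's appeal to mirror descent can be made concrete with a textbook instance: when architecture parameters are mixture weights on the probability simplex, mirror descent with the negative-entropy mirror map reduces to the exponentiated-gradient update, whose multiplicative form drives weights toward a sparse vertex. The following is a minimal sketch under that assumption, not the authors' implementation; the function names, the toy linear loss, and the learning rate are hypothetical choices made for illustration.

```python
import math


def exponentiated_gradient_step(theta, grad, lr):
    """One mirror-descent step on the probability simplex using the
    negative-entropy mirror map (the exponentiated-gradient update)."""
    # Shift the exponents so the largest one is 0; this leaves the
    # normalised result unchanged but avoids overflow/underflow issues.
    shift = min(lr * g for g in grad)
    unnorm = [t * math.exp(-(lr * g - shift)) for t, g in zip(theta, grad)]
    total = sum(unnorm)
    # Renormalising keeps the iterate on the simplex.
    return [u / total for u in unnorm]


def run_demo(steps=50, lr=0.5):
    """Toy problem (an assumption for illustration): minimise the linear
    loss <costs, theta> over mixture weights for three candidate operations."""
    costs = [0.9, 0.2, 0.6]      # hypothetical per-operation losses
    theta = [1.0 / 3] * 3        # start from the uniform mixture
    for _ in range(steps):
        # For a linear loss the gradient with respect to theta is `costs`.
        theta = exponentiated_gradient_step(theta, costs, lr)
    return theta


if __name__ == "__main__":
    print(run_demo())
```

In this toy run the weights concentrate on the lowest-cost operation while remaining strictly positive and summing to one, which is the kind of near-sparse solution the abstract alludes to.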

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

18/07/2021

iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients

Miao Zhang, Steven Su, Shirui Pan, Xiaojun Chang, Mohammad Abbasnejad, Reza Haffari

Keywords: Algorithms, AutoML

06/12/2020

Differentiable Neural Architecture Search in Equivalent Space with Exploration Enhancement

Miao Zhang, Huiqi Li, Shirui Pan, Xiaojun Chang, Zongyuan Ge, Steven Su

26/04/2020

Short and Sparse Deconvolution --- A Geometric Approach

Yenson Lau, Qing Qu, Han-Wen Kuo, Pengcheng Zhou, Yuqian Zhang, John Wright

03/05/2021

Byzantine-Resilient Non-Convex Stochastic Gradient Descent

Zeyuan Allen-Zhu, Faeze Ebrahimianghazani, Jerry Li, Dan Alistarh

Keywords: Byzantine resilience, robust deep learning, distributed deep learning, distributed machine learning, non-convex optimization

03/05/2021

DrNAS: Dirichlet Neural Architecture Search

Xiangning Chen, Ruochen Wang, Minhao Cheng, Xiaocheng Tang, Cho-Jui Hsieh

14/06/2020

Quasi-Newton Solver for Robust Non-Rigid Registration

Yuxin Yao, Bailin Deng, Weiwei Xu, Juyong Zhang

Keywords: non-rigid registration, robust estimator, quasi-newton, welsch's function, mm algorithm, l-bfgs, deformation graph

14/06/2020

A Graduated Filter Method for Large Scale Robust Estimation

Huu Le, Christopher Zach

Keywords: robust fitting, bundle adjustment, non-convex, poor local minima, non-linear least squares, graduated non-convexity

12/07/2020

Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks

Mark Kurtz, Justin Kopinsky, Rati Gelashvili, Alexander Matveev, John Carr, Michael Goin, William Leiserson, Sage Moore, Nir Shavit, Dan Alistarh

Keywords: Deep Learning - Algorithms

06/12/2021

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Dylan J Foster, Akshay Krishnamurthy

Keywords: theory, reinforcement learning and planning, bandits, online learning

18/07/2021

Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm

Mingkang Zhu, Tianlong Chen, Zhangyang Wang

Keywords: Deep Learning, Generative Models, Data, Challenges, Implementations, and Software, Benchmarks, Algorithms, Adversarial Examples

06/12/2021

Progressive Feature Interaction Search for Deep Sparse Network

Chen Gao, Yinfeng Li, Quanming Yao, Depeng Jin, Yong Li

Keywords: deep learning, machine learning

26/08/2020

ASAP: Architecture Search, Anneal and Prune

Asaf Noy, Niv Nayman, Tal Ridnik, Nadav Zamir, Sivan Doveh, Itamar Friedman, Raja Giryes, Lihi Zelnik

12/07/2020

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates

Anastasiia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian Stich

Keywords: Optimization - Large Scale, Parallel and Distributed

22/11/2021

Noisy Differentiable Architecture Search

Xiangxiang Chu, Bo Zhang

Keywords: Neural architecture search, AutoML

22/11/2021

Variance-stationary Differentiable NAS

Hyeokjun Choe, Byunggook Na, Jisoo Mok, Sungroh Yoon

Keywords: darts, differentiable nas, one-shot nas, neural architecture search, nas, architecture parameter, automl, vs-darts, vsdarts, variance-stationary

06/12/2021

Asynchronous Decentralized SGD with Quantized and Local Updates

Giorgi Nadiradze, Amirmojtaba Sabour, Peter Davies, Shigang Li, Dan Alistarh

Keywords: optimization, machine learning, graph learning

12/07/2020

dS^2LBI: Exploring Structural Sparsity on Deep Network via Differential Inclusion Paths

Yanwei Fu, Chen Liu, Donghao Li, Xinwei Sun, Jinshan Zeng, Yuan Yao

Keywords: Deep Learning - Algorithms

02/02/2021

Automated Clustering of High-dimensional Data with a Feature Weighted Mean Shift Algorithm

Saptarshi Chakraborty, Debolina Paul, Swagatam Das

03/05/2021

Rethinking Architecture Selection in Differentiable NAS

Ruochen Wang, Minhao Cheng, Xiangning Chen, Xiaocheng Tang, Cho-Jui Hsieh

14/06/2020

Rethinking Differentiable Search for Mixed-Precision Neural Networks

Zhaowei Cai, Nuno Vasconcelos

Keywords: mixed-precision network, bit allocation, differentiable, architecture search

19/08/2021

Discrete Multiple Kernel k-means

Rong Wang, Jitao Lu, Yihang Lu, Feiping Nie, Xuelong Li

Keywords: Machine Learning, Clustering, Kernel Methods, Multi-instance; Multi-label; Multi-view learning

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

06/12/2020

Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function

Quoc Tran Dinh, Deyi Liu, Lam Nguyen

06/12/2020

ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding

Yibo Yang, Hongyang Li, Shan You, Fei Wang, Chen Qian, Zhouchen Lin

03/05/2021

MetaNorm: Learning to Normalize Few-Shot Batches Across Domains

Yingjun Du, Xiantong Zhen, Ling Shao, Cees G Snoek

Keywords: batch normalization, Meta-learning, few-shot domain generalization

12/07/2020

The continuous categorical: a novel simplex-valued exponential family

Elliott Gordon-Rodriguez, Gabriel Loaiza-Ganem, John Cunningham

Keywords: Probabilistic Inference - Models and Probabilistic Programming

02/02/2021

Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision

Xingchao Liu, Mao Ye, Dengyong Zhou, Qiang Liu

06/12/2020

Hypersolvers: Toward Fast Continuous-Depth Models

Michael Poli, Stefano Massaroli, Atsushi Yamashita, Hajime Asama, Jinkyoo Park

14/06/2020

Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking

Jin Gao, Weiming Hu, Yan Lu

Keywords: online learning, visual tracking, continual learning, recursive least-squares estimation, deep learning, memory retention, recursive learning, mini-batch sgd, normal equation, mlp layer

06/12/2021

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

Maura Pintor, Fabio Roli, Wieland Brendel, Battista Biggio

Keywords: optimization, machine learning, robustness, adversarial robustness and security, vision

26/04/2020

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius

Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Liwei Wang

Keywords: Adversarial Robustness, Provable Adversarial Defense, Randomized Smoothing, Robustness Certification

06/12/2021

Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection

HanQin Cai, Jialin Liu, Wotao Yin

Keywords: deep learning, machine learning

06/12/2020

Distributed Training with Heterogeneous Data: Bridging Median- and Mean-Based Algorithms

Xiangyi Chen, Tiancong Chen, Haoran Sun, Steven Wu, Mingyi Hong

06/12/2020

Theory-Inspired Path-Regularized Differential Network Architecture Search

Pan Zhou, Caiming Xiong, Richard Socher, Steven Hoi

02/02/2021

Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

Vyacheslav Kungurtsev, Malcolm Egan, Bapi Chatterjee, Dan Alistarh

06/12/2021

Hyperparameter Tuning is All You Need for LISTA

Xiaohan Chen, Jialin Liu, Zhangyang Wang, Wotao Yin

Keywords: deep learning

14/06/2020

On the Acceleration of Deep Learning Model Parallelism With Staleness

An Xu, Zhouyuan Huo, Heng Huang

Keywords: layer-wise staleness, asynchronous model parallelism, convolutional neural networks

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords: deep learning, information redundancy, pruning

12/07/2020

Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

Francesco Croce, Matthias Hein

Keywords: Adversarial Examples

12/07/2020

Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion

Qinqing Zheng, Jinshuo Dong, Qi Long, Weijie Su

Keywords: Privacy-preserving Statistics and Machine Learning