Understanding the Effect of Stochasticity in Policy Optimization

06/12/2021

Jincheng Mei, Bo Dai, Chenjun Xiao, Csaba Szepesvari, Dale Schuurmans

Keywords: theory, optimization, reinforcement learning and planning

Abstract: We study the effect of stochasticity in on-policy policy optimization, and make the following four contributions. First, we show that the preferability of optimization methods depends critically on whether stochastic or exact gradients are used. In particular, unlike the true gradient setting, geometric information cannot be easily exploited in the stochastic case for accelerating policy optimization without detrimental consequences or impractical assumptions. Second, to explain these findings we introduce the concept of committal rate for stochastic policy optimization, and show that this can serve as a criterion for determining almost sure convergence to global optimality. Third, we show that in the absence of external oracle information, which allows an algorithm to determine the difference between optimal and sub-optimal actions given only on-policy samples, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely. That is, an uninformed algorithm either converges to a globally optimal policy with probability $1$ but at a rate no better than $O(1/t)$, or it achieves faster than $O(1/t)$ convergence but then must fail to converge to the globally optimal policy with some positive probability. Finally, we use the committal rate theory to explain why practical policy optimization methods are sensitive to random initialization, then develop an ensemble method that can be guaranteed to achieve near-optimal solutions with high probability.
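
The trade-off described in the abstract can be explored with a small simulation. The following is a minimal sketch, not the authors' code, and it rests on several assumptions that do not appear on this page: a multi-armed bandit with fixed rewards, a softmax policy, an on-policy importance-sampling reward estimate, and the natural policy gradient update as one example of a geometry-exploiting method. All function names, learning rates, and reward values are hypothetical.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(x - m) for x in theta]
    z = sum(exps)
    return [e / z for e in exps]


def is_reward_estimate(pi, rewards, action):
    """On-policy importance-sampling estimate (assumed form):
    r_hat(a) = 1{a == action} * r(a) / pi(a)."""
    r_hat = [0.0] * len(pi)
    r_hat[action] = rewards[action] / pi[action]
    return r_hat


def softmax_pg_step(theta, r_hat, lr):
    """Stochastic softmax policy gradient: theta(a) += lr * pi(a) * (r_hat(a) - pi . r_hat)."""
    pi = softmax(theta)
    baseline = sum(p * r for p, r in zip(pi, r_hat))
    return [t + lr * p * (r - baseline) for t, p, r in zip(theta, pi, r_hat)]


def npg_step(theta, r_hat, lr):
    """Natural policy gradient for a softmax bandit policy: theta(a) += lr * r_hat(a)."""
    return [t + lr * r for t, r in zip(theta, r_hat)]


def run(step_fn, rewards, lr, steps, seed):
    """Run one on-policy trajectory from a uniform policy and return the final policy."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        pi = softmax(theta)
        action = rng.choices(range(len(rewards)), weights=pi)[0]
        r_hat = is_reward_estimate(pi, rewards, action)
        theta = step_fn(theta, r_hat, lr)
    return softmax(theta)


def fraction_committed_to_suboptimal(step_fn, rewards, lr, steps, seeds):
    """Fraction of seeds whose final policy puts most of its mass on a non-optimal arm."""
    best = max(range(len(rewards)), key=lambda a: rewards[a])
    stuck = 0
    for seed in seeds:
        pi = run(step_fn, rewards, lr, steps, seed)
        if max(range(len(pi)), key=lambda a: pi[a]) != best:
            stuck += 1
    return stuck / len(seeds)


if __name__ == "__main__":
    rewards = [1.0, 0.8, 0.2]  # hypothetical arm rewards; arm 0 is optimal
    for name, step_fn in [("softmax PG", softmax_pg_step), ("NPG", npg_step)]:
        frac = fraction_committed_to_suboptimal(step_fn, rewards, 0.1, 2000, range(50))
        print(f"{name}: fraction of runs ending on a sub-optimal arm = {frac:.2f}")
```

Comparing the two printed fractions across seeds gives a rough empirical picture of how often each update rule commits to a sub-optimal arm. The numbers depend on the assumed rewards, learning rate, and horizon, so they illustrate the question rather than reproduce any result from the paper.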

The talk and the corresponding paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2021

Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis

Jikai Jin, Bohang Zhang, Haiyang Wang, Liwei Wang

Keywords: optimization

14:05

06/12/2021

Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes

Jai Moondra, Hassan Mortagy, Swati Gupta

Keywords: optimization, online learning

15:03

06/12/2021

On the Bias-Variance-Cost Tradeoff of Stochastic Optimization

Yifan Hu, Xin Chen, Niao He

Keywords: theory, optimization, machine learning

14:56

06/12/2021

Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates

Alp Yurtsever, Alex Gu, Suvrit Sra

Keywords: optimization, machine learning

14:21

02/02/2021

Robust Finite-State Controllers for Uncertain POMDPs

Murat Cubuktepe, Nils Jansen, Sebastian Junges, Ahmadreza Marandi, Marnix Suilen, Ufuk Topcu

16:50

06/12/2021

Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses

Haipeng Luo, Chen-Yu Wei, Chung-Wei Lee

Keywords: optimization, reinforcement learning and planning, bandits

15:17

12/07/2020

The Complexity of Finding Stationary Points with Stochastic Gradient Descent

Yoel Drori, Ohad Shamir

Keywords: Optimization - Non-convex

9:09

09/07/2020

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

Yossi Arjevani, Yair Carmon, John Duchi, Dylan Foster, Ayush Sekhari, Karthik Sridharan

Keywords: Non-convex optimization, Stochastic optimization

11:57

06/12/2020

Adapting to Misspecification in Contextual Bandits

Dylan Foster, Claudio Gentile, Mehryar Mohri, Julian Zimmert

3:05

06/12/2020

One Ring to Rule Them All: Certifiably Robust Geometric Perception with Outliers

Heng Yang, Luca Carlone

3:24

09/07/2020

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Alekh Agarwal, Sham Kakade, Jason Lee, Gaurav Mahajan

Keywords: Reinforcement learning, Non-convex optimization

11:00

06/12/2021

Regret Bounds for Gaussian-Process Optimization in Large Domains

Manuel Wuethrich, Bernhard Schölkopf, Andreas Krause

Keywords: optimization, bandits, kernel methods

13:02

18/07/2021

Leveraging Non-uniformity in First-order Non-convex Optimization

Jincheng Mei, Yue Gao, Bo Dai, Csaba Szepesvari, Dale Schuurmans

Keywords: Theory, RL, Decisions and Control Theory

4:49

06/12/2020

Projection Robust Wasserstein Distance and Riemannian Optimization

Darren Lin, Chenyou Fan, Nhat Ho, Marco Cuturi, Michael Jordan

Keywords: Optimization -> Non-Convex Optimization; Optimization -> Stochastic Optimization; Deep Learning -> Optimization for Deep Networks

3:01

06/12/2020

Escaping the Gravitational Pull of Softmax

Jincheng Mei, Chenjun Xiao, Bo Dai, Lihong Li, Csaba Szepesvari, Dale Schuurmans

3:27

06/12/2020

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems

Luo Luo, Haishan Ye, Zhichao Huang, Tong Zhang

2:00

12/07/2020

On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data

Di Wang, Hanshen Xiao, Srinivas Devadas, Jinhui Xu

Keywords: Privacy-preserving Statistics and Machine Learning

15:52

18/07/2021

Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization

Wes Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux

Keywords: Reinforcement Learning and Planning

5:13

03/05/2021

Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds

Yihao Feng, Ziyang Tang, Na Zhang, Qiang Liu

Keywords: Reinforcement Learning, Off Policy Evaluation, Non-asymptotic Confidence Intervals

4:26

03/05/2021

Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry

Ziyi Chen, Yi Zhou, Tengyu Xu, Yingbin Liang

Keywords: minimax, variable convergence, proximal gradient descent-ascent, nonconvex, Kurdyka-Łojasiewicz geometry

5:10

18/07/2021

Regularized Submodular Maximization at Scale

Ehsan Kazemi, Shervin Minaee, Moran Feldman, Amin Karbasi

Keywords: Optimization, Combinatorial Optimization

5:17

03/08/2020

Zeroth Order Non-convex optimization with Dueling-Choice Bandits

Yichong Xu, Aparna Joshi, Aarti Singh, Artur Dubrawski

8:35

12/07/2020

Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization

Richard Zhang, Daniel Golovin

Keywords: Online Learning, Active Learning, and Bandits

16:39

06/12/2021

Rectangular Flows for Manifold Learning

Anthony Caterini, Gabriel Loaiza-Ganem, Geoff Pleiss, John Cunningham

Keywords: deep learning, optimization, generative model

12:26

06/12/2021

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum

Prashant Khanduri, Siliang Zeng, Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang

Keywords: optimization

9:47

02/02/2021

Improved Penalty Method via Doubly Stochastic Gradients for Bilevel Hyperparameter Optimization

Wanli Shi, Bin Gu

14:47

06/12/2021

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Gen Li, Laixi Shi, Yuxin Chen, Yuantao Gu, Yuejie Chi

Keywords: theory, reinforcement learning and planning

15:32

06/12/2020

Large-Scale Methods for Distributionally Robust Optimization

Daniel Levy, Yair Carmon, John Duchi, Aaron Sidford

3:11

12/07/2020

Stronger and Faster Wasserstein Adversarial Attacks

Kaiwen Wu, Allen Wang, Yaoliang Yu

Keywords: Adversarial Examples

14:56

12/07/2020

Optimization from Structured Samples for Coverage Functions

Wei Chen, Xiaoming Sun, Jialin Zhang, Zhijie Zhang

Keywords: Optimization - General

14:22

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

3:20

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords: Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

5:03

06/12/2020

Geometric Exploration for Online Control

Orestis Plevrakis, Elad Hazan

3:21

18/07/2021

Dueling Convex Optimization

Aadirupa Saha, Tomer Koren, Yishay Mansour

Keywords: Algorithms, Multitask and Transfer Learning, Algorithms, Ranking and Preference Learning, Algorithms, Classification

6:19

04/08/2021

SGD Generalizes Better Than GD (And Regularization Doesn't Help)

Idan Amir, Tomer Koren, Roi Livni

15:53

06/12/2021

High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails

Ashok Cutkosky, Harsh Mehta

Keywords: deep learning, optimization

20:14

13/04/2021

Convergence properties of stochastic hypergradients

Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

3:11

06/12/2021

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

14:56

18/07/2021

Private Stochastic Convex Optimization: Optimal Rates in L1 Geometry

Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

Keywords: Deep Learning, Algorithms, Multitask and Transfer Learning; Algorithms, Online Learning, Social Aspects of Machine Learning, Privacy, Anonymity, and Security

17:27

04/08/2021

Adaptivity in Adaptive Submodularity

Hossein Esfandiari, Amin Karbasi, Vahab Mirrokni

13:54