Data-Efficient Reinforcement Learning for Malaria Control

19/08/2021

Lixin Zou

Keywords: Agent-based and Multi-agent Systems, Human-Agent Interaction, Environmental

Abstract: Sequential decision-making in cost-sensitive tasks is daunting, especially for problems that significantly affect people's daily lives, such as malaria control and treatment recommendation. The main challenge faced by policymakers is to learn a policy from scratch by interacting with a complex environment in only a few trials. This work introduces a practical, data-efficient policy learning method, named Variance-Bonus Monte Carlo Tree Search (VB-MCTS), which can cope with very little data and facilitates learning from scratch in only a few trials. Specifically, the solution is a model-based reinforcement learning method. To avoid model bias, we apply Gaussian Process (GP) regression to estimate the transitions explicitly. With the GP world model, we propose a variance-bonus reward that measures the uncertainty about the world. Adding this reward to planning with MCTS can result in more efficient and effective exploration. Furthermore, the derived polynomial sample complexity indicates that VB-MCTS is sample efficient. Finally, outstanding performance in a competitive world-level RL competition and extensive experimental results verify its advantage over the state of the art on the challenging malaria control task.
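
The idea of a variance-bonus reward can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the hyperparameters, the function names, and the additive form `mean + beta * variance` are all assumptions made for illustration, and the MCTS planning step is omitted. The sketch only shows how a GP posterior yields an uncertainty estimate that can be added to a predicted reward.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    # alpha = K^{-1} y, computed through the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    var = np.diag(rbf_kernel(x_query, x_query)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


def variance_bonus_reward(mean, var, beta=1.0):
    """Hypothetical bonus form: predicted reward plus beta times the variance."""
    return mean + beta * var
```

Under these assumptions, a query far from the observed data has a posterior variance close to the prior variance, so it receives a larger bonus than a query near an observed point, which is what steers a planner toward unexplored state-action pairs.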

The talk and the respective paper were published at the IJCAI 2021 virtual conference.

Similar Papers

06/12/2020

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

3:24

06/12/2021

Learning Collaborative Policies to Solve NP-hard Routing Problems

Minsu Kim, Jinkyoo Park, Joungho Kim

Keywords: reinforcement learning and planning

15:03

06/12/2021

Continuous Doubly Constrained Batch Reinforcement Learning

Rasool Fakoor, Jonas Mueller, Kavosh Asadi, Pratik Chaudhari, Alexander J Smola

Keywords: reinforcement learning and planning

14:06

18/07/2021

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu

Keywords: Reinforcement Learning and Planning

5:54

18/07/2021

A Distribution-dependent Analysis of Meta Learning

Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari

Keywords: Theory, Statistical Learning Theory

5:06

03/05/2021

UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers

Siyi Hu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang

Keywords: Transfer Learning, Multi-agent Reinforcement Learning

2:46

03/05/2021

Blending MPC & Value Function Approximation for Efficient Reinforcement Learning

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

Keywords: reinforcement learning, model-predictive control

5:09

06/12/2021

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning

Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe

Keywords: deep learning, optimization, reinforcement learning and planning

14:52

22/09/2020

FISSA: Fusing item similarity models with self-attention networks for sequential recommendation

Jing Lin, Weike Pan, Zhong Ming

Keywords: Item Similarity Models, Sequential Recommendation, Gating Networks, Self-Attention

2:06

03/05/2021

Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples

Ziang Yan, Yiwen Guo, Jian Liang, Changshui Zhang

Keywords: hard-label attack, adversarial attack, black-box attack, reinforcement learning

4:55

13/04/2021

Active learning with maximum margin sparse Gaussian processes

Weishi Shi, Qi Yu

3:00

02/02/2021

Learning Compositional Sparse Gaussian Processes with a Shrinkage Prior

Anh Tong, Toan M Tran, Hung Bui, Jaesik Choi

18:06

26/04/2020

Ranking Policy Gradient

Kaixiang Lin, Jiayu Zhou

Keywords: Sample-efficient reinforcement learning, off-policy learning

5:43

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords: theory, optimization, reinforcement learning and planning, active learning

11:42

06/12/2021

Multi-Label Learning with Pairwise Relevance Ordering

Ming-Kun Xie, Sheng-Jun Huang

Keywords: machine learning

3:56

26/04/2020

Exploring Model-based Planning with Policy Networks

Tingwu Wang, Jimmy Ba

Keywords: reinforcement learning, model-based reinforcement learning, planning

5:01

18/07/2021

Is Pessimism Provably Efficient for Offline RL?

Ying Jin, Zhuoran Yang, Zhaoran Wang

Keywords: Reinforcement Learning and Planning, Others

5:17

06/12/2020

Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards

Yijie Guo, Jongwook Choi, Marcin Moczulski, Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee

3:30

03/05/2021

Behavioral Cloning from Noisy Demonstrations

Fumihiro Sasaki, Ryota Yamashina

Keywords: Inverse Reinforcement Learning, Noisy Demonstrations, Imitation Learning

9:36

06/12/2021

An Information-theoretic Approach to Distribution Shifts

Marco Federici, Ryota Tomioka, Patrick Forré

Keywords: theory, deep learning, machine learning, graph learning, domain adaptation, representation learning

9:50

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia Lin

Keywords: optimization, machine learning

14:51

03/05/2021

Batch Reinforcement Learning Through Continuation Method

Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed H. Chi, Honglak Lee, Minmin Chen

Keywords: batch reinforcement learning, relaxed regularization, continuation method

5:34

02/02/2021

Federated Multi-Armed Bandits

Chengshuai Shi, Cong Shen

15:26

06/12/2020

Model-based Policy Optimization with Unsupervised Model Adaptation

Jian Shen, Han Zhao, Weinan Zhang, Yong Yu

3:09

06/12/2021

Personalized Federated Learning With Gaussian Processes

Idan Achituve, Aviv Shamsian, Aviv Navon, Gal Chechik, Ethan Fetaya

Keywords: deep learning, kernel methods, federated learning

7:22

06/12/2020

Conservative Q-Learning for Offline Reinforcement Learning

Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

Keywords: Algorithms -> Representation Learning; Algorithms -> Structured Prediction; Applications -> Computational Biology and Bioinform, Deep Learning -> Embedding Approaches

3:16

06/12/2020

Task-Agnostic Amortized Inference of Gaussian Process Hyperparameters

Sulin Liu, Xingyuan Sun, Peter J Ramadge, Ryan Adams

3:46

18/07/2021

Adaptive Sampling for Best Policy Identification in Markov Decision Processes

Aymen Al Marjani, Alexandre Proutiere

Keywords: Theory, RL, Decisions and Control Theory

5:35

06/12/2021

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

Keywords: deep learning, optimization, reinforcement learning and planning

12:35

02/02/2021

Query Training: Learning a Worse Model to Infer Better Marginals in Undirected Graphical Models with Hidden Variables

Miguel Lázaro-Gredilla, Wolfgang Lehrach, Nishad Gothoskar, Guangyao Zhou, Antoine Dedieu, Dileep George

19:45

02/02/2021

Robust Reinforcement Learning: A Case Study in Linear Quadratic Regulation

Bo Pang, Zhong-Ping Jiang

20:01

12/07/2020

Batch Reinforcement Learning with Hyperparameter Gradients

Byung-Jun Lee, Jongmin Lee, Peter Vrancx, Dongho Kim, Kee-Eung Kim

Keywords: Reinforcement Learning - General

16:07

18/07/2021

Active Learning of Continuous-time Bayesian Networks through Interventions

Dominik Linzner, Heinz Koeppl

Keywords: Probabilistic Methods, Graphical Models

5:07

06/12/2020

Differentiable Meta-Learning of Bandit Policies

Craig Boutilier, Chih-wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

3:10

06/12/2020

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

3:13

02/02/2021

Solving Common-Payoff Games with Approximate Policy Iteration

Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

18:14

06/12/2020

MOPO: Model-based Offline Policy Optimization

Tianhe (Kevin) Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

3:30

06/12/2020

Functional Regularization for Representation Learning: A Unified Theoretical Perspective

Siddhant Garg, Yingyu Liang

3:19

02/02/2021

Longitudinal Deep Kernel Gaussian Process Regression

Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant G Honavar

16:27

12/07/2020

Momentum-Based Policy Gradient Methods

Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Keywords: Reinforcement Learning - General

13:28