Deep Synoptic Monte-Carlo Planning in Reconnaissance Blind Chess

06/12/2021

Gregory Clark

Keywords: deep learning, reinforcement learning and planning, bandits

Abstract: This paper introduces deep synoptic Monte Carlo planning (DSMCP) for large imperfect information games. The algorithm constructs a belief state with an unweighted particle filter and plans via playouts that start at samples drawn from the belief state. The algorithm accounts for uncertainty by performing inference on "synopses," a novel stochastic abstraction of information states. DSMCP is the basis of the program Penumbra, which won the official 2020 reconnaissance blind chess competition versus 33 other programs. This paper also evaluates algorithm variants that incorporate caution, paranoia, and a novel bandit algorithm. Furthermore, it audits the synopsis features used in Penumbra with per-bit saliency statistics.
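The two steps named in the abstract, maintaining a belief state with an unweighted particle filter and planning with playouts that start at states sampled from that belief, can be illustrated with a minimal toy sketch. This is not the paper's DSMCP implementation: it omits synopses, neural networks, and bandit-based action selection, and the toy guessing game, the function names `filter_particles` and `plan`, and all parameters are assumptions made purely for illustration.

```python
import random


def filter_particles(particles, is_consistent):
    """Unweighted update: keep only the particles that agree with the observation."""
    return [p for p in particles if is_consistent(p)]


def plan(particles, actions, playout, n_playouts, rng):
    """Pick the action with the best average playout return.

    Every playout starts from a state sampled uniformly from the particles,
    so the averages reflect uncertainty about the hidden state.
    """
    totals = {a: 0.0 for a in actions}
    for a in actions:
        for _ in range(n_playouts):
            state = rng.choice(particles)
            totals[a] += playout(state, a)
    return max(actions, key=lambda a: totals[a] / n_playouts)


# Hypothetical toy game: a target hides in one of the cells 0..7.
rng = random.Random(0)
particles = [rng.randrange(8) for _ in range(200)]

# Assumed observation: a sensing action reveals that the target is in cells 4..7.
particles = filter_particles(particles, lambda s: s >= 4)


def playout(state, action):
    """One-step playout: reward 1 if the guessed cell matches the sampled state."""
    return 1.0 if state == action else 0.0


best = plan(particles, list(range(8)), playout, n_playouts=50, rng=rng)
print(len(particles), best)
```

Because all surviving particles carry equal weight, the belief update reduces to discarding inconsistent samples, and the planner never guesses a cell that the observation has ruled out.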

The talk and the respective paper are published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2021

Adaptive Online Packing-guided Search for POMDPs

Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, Jianye Hao

13:30

12/07/2020

OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning

Alexander Vezhnevets, Yuhuai Wu, Maria Eckstein, Rémi Leblond, Joel Z Leibo

Keywords: Reinforcement Learning - Deep RL

14:17

06/12/2020

A Game Theoretic Analysis of Additive Adversarial Attacks and Defenses

Ambar Pal, Rene Vidal

3:19

02/02/2021

Estimating α-Rank by Maximizing Information Gain

Tabish Rashid, Cheng Zhang, Kamil Ciosek

14:52

06/12/2021

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Keywords: theory, reinforcement learning and planning, bandits

12:14

06/12/2020

Learning Strategy-Aware Linear Classifiers

Yiling Chen, Yang Liu, Chara Podimata

3:15

06/12/2021

Continuous Mean-Covariance Bandits

Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang

Keywords: bandits

11:33

06/12/2020

Preference-based Reinforcement Learning with Finite-Time Guarantees

Yichong Xu, Ruosong Wang, Lin Yang, Aarti Singh, Artur Dubrawski

3:04

13/04/2021

Neural enhanced belief propagation on factor graphs

Víctor Garcia Satorras, Max Welling

3:14

12/07/2020

Information Particle Filter Tree: An Online Algorithm for POMDPs with Belief-Based Rewards on Continuous Domains

Johannes Fischer, Ömer Sahin Tas

Keywords: Online Learning, Active Learning, and Bandits

14:44

02/02/2021

Bayesian Distributional Policy Gradients

Luchen Li, A. Aldo Faisal

18:06

19/04/2021

An empirical study on the generalization power of neural representations learned via visual guessing games

Alessandro Suglia, Yonatan Bisk, Ioannis Konstas, Antonio Vergari, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

7:16

06/12/2021

Neural Auto-Curricula in Two-Player Zero-Sum Games

Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang, Yaodong Yang

Keywords: deep learning, optimization, reinforcement learning and planning, meta learning

14:46

02/02/2021

Convergence Analysis of No-Regret Bidding Algorithms in Repeated Auctions

Zhe Feng, Guru Guruganesh, Christopher Liaw, Aranyak Mehta, Abhishek Sethi

20:14

03/05/2021

Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions

Yun Kuen Cheung, Yixin Tao

Keywords: Dynamical Systems, Volume Analysis, Follow-the-Regularized-Leader, Multiplicative Weights Update, Game Decomposition, Lyapunov Chaos, Learning in Games

3:53

19/08/2021

Combining Tree Search and Action Prediction for State-of-the-Art Performance in DouDiZhu

Yunsheng Zhang, Dong Yan, Bei Shi, Haobo Fu, Qiang Fu, Hang Su, Jun Zhu, Ning Chen

Keywords: Machine Learning, Reinforcement Learning, Game Playing and Machine Learning

12:03

12/07/2020

Influence Diagram Bandits

Tong Yu, Branislav Kveton, Zheng Wen, Ruiyi Zhang, Ole J. Mengshoel

Keywords: Online Learning, Active Learning, and Bandits

14:14

02/02/2021

Sequential Generative Exploration Model for Partially Observable Reinforcement Learning

Haiyan Yin, Jianda Chen, Sinno Jialin Pan, Sebastian Tschiatschek

14:40

13/04/2021

Local competition and stochasticity for adversarial robustness in deep learning

Konstantinos Panousis, Sotirios Chatzis, Antonios Alexos, Sergios Theodoridis

3:21

06/12/2021

Regime Switching Bandits

Xiang Zhou, Yi Xiong, Ningyuan Chen, Xuefeng GAO

Keywords: reinforcement learning and planning, bandits, online learning

13:47

13/04/2021

Reinforcement learning for mean field games with strategic complementarities

Kiyeob Lee, Desik Rengarajan, Dileep Kalathil, Srinivas Shakkottai

2:57

03/05/2021

Provable Rich Observation Reinforcement Learning with Combinatorial Latent States

Dipendra Misra, Qinghua Liu, Chi Jin, John Langford

Keywords: Factored MDP, State abstraction, Noise-contrastive learning, Rich observation, Reinforcement learning theory

5:08

18/07/2021

Post-selection inference with HSIC-Lasso

Tobias Freidling, Benjamin Poignard, Héctor Climente-González, Makoto Yamada

Keywords: Theory, Statistical Learning Theory

5:03

06/12/2020

Latent Bandits Revisited

Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier

3:11

22/11/2021

Updated Paired Regions for Shadow Detection from Single Image

Xiao Wang, Siyuan Yao, Pengwen Dai, Rui Wang (State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, School of Cyber Security, University of Chinese Academy of Sciences), Xiaochun Cao

Keywords: Shadow detection, paired regions, penumbra region, physical confidence coefficients

2:20

03/05/2021

Set Prediction without Imposing Structure as Conditional Density Estimation

David W Zhang, Gertjan J Burghouts, Cees G Snoek

Keywords: energy based models, set prediction

5:02

06/12/2021

Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound

Valentina Zantedeschi, Paul Viallard, Emilie Morvant, Rémi Emonet, Amaury Habrard, Pascal Germain, Benjamin Guedj

12:31

06/12/2020

An implicit function learning approach for parametric modal regression

Yangchen Pan, Ehsan Imani, Amir-massoud Farahmand, Martha White

3:09

03/05/2021

Calibration tests beyond classification

David Widmann, Fredrik Lindsten, Dave Zachariah

Keywords: uncertainty quantification, maximum mean discrepancy, integral probability metric, framework, calibration

6:05

06/12/2020

Robust Multi-Agent Reinforcement Learning with Model Uncertainty

Kaiqing Zhang, TAO SUN, Yunzhe Tao, Sahika Genc, Sunil Mallya, Tamer Basar

3:11

20/07/2020

Deep Fictitious Play for Finding Markovian Nash Equilibrium in Multi-Agent Games

Jiequn Han, Ruimeng Hu

16:35

03/05/2021

C-Learning: Learning to Achieve Goals via Recursive Classification

Ben Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Keywords: reinforcement learning, goal reaching, density estimation, hindsight relabeling, Q-learning

5:09

19/08/2021

Learning Generalized Unsolvability Heuristics for Classical Planning

Simon Ståhlberg, Guillem Francès, Jendrik Seipp

Keywords: Planning and Scheduling

12:33

06/12/2021

Learning in two-player zero-sum partially observable Markov games with perfect recall

Tadashi Kozuno, Pierre Ménard, Remi Munos, Michal Valko

Keywords: reinforcement learning and planning, bandits, online learning

9:31

25/07/2020

Asymmetric tri-training for debiasing missing-not-at-random explicit feedback

Yuta Saito

Keywords: recommender systems, unsupervised domain adaptation, missing-not-at-random, matrix factorization, selection bias, explicit feedback

18:03

26/04/2020

Double Neural Counterfactual Regret Minimization

Hui Li, Kailiang Hu, Shaohua Zhang, Yuan Qi, Le Song

Keywords: Counterfactual Regret Minimization, Imperfect Information game, Neural Strategy, Deep Learning, Robust Sampling

4:49

06/12/2021

Conservative Offline Distributional Reinforcement Learning

Yecheng Ma, Dinesh Jayaraman, Osbert Bastani

Keywords: reinforcement learning and planning

13:54

18/07/2021

From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization

Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls

Keywords: Probabilistic Methods, Causal Inference, Reinforcement Learning and Planning, Multi-Agent RL, Probabilistic Methods, Graphical Models

5:24

26/04/2020

Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations

Xiao Ma, Peter Karkus, David Hsu, Wee Sun Lee, Nan Ye

Keywords: Reinforcement Learning, Partial Observability, Differentiable Particle Filtering

5:08

06/12/2021

Exploiting Opponents Under Utility Constraints in Sequential Games

Martino Bernasconi-de-Luca, Federico Cacciamani, Simone Fioravanti, Nicola Gatti, Alberto Marchesi, Francesco Trovò

Keywords: online learning

12:59