Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

J. Schrittwieser, I. Antonoglou, T. Hubert, K. Simonyan, L. Sifre, S. Schmitt, A. Guez, E. Lockhart, D. Hassabis, T. Graepel, T. Lillicrap, and D. Silver.
(2019). arXiv:1911.08265.

Abstract

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the MuZero algorithm which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. MuZero learns a model that, when applied iteratively, predicts the quantities most directly relevant to planning: the reward, the action-selection policy, and the value function. When evaluated on 57 different Atari games - the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled - our new algorithm achieved a new state of the art. When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.
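The abstract's central idea, a model that is applied iteratively and predicts reward, policy, and value at each step, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the class name `ToyLearnedModel`, the method names `represent`, `step`, and `predict`, the random untrained weights, and the dimensions are hypothetical and are not taken from the abstract. The tree-based search and the training procedure are omitted entirely.

```python
import math
import random


def make_linear(rows, cols, rng):
    """Random weight matrix; stands in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyLearnedModel:
    """Hypothetical stand-in for a learned model; weights are untrained."""

    def __init__(self, obs_dim, hidden_dim, num_actions, seed=0):
        rng = random.Random(seed)
        self.num_actions = num_actions
        self.w_repr = make_linear(hidden_dim, obs_dim, rng)
        self.w_dyn = make_linear(hidden_dim, hidden_dim + num_actions, rng)
        self.w_reward = make_linear(1, hidden_dim + num_actions, rng)
        self.w_policy = make_linear(num_actions, hidden_dim, rng)
        self.w_value = make_linear(1, hidden_dim, rng)

    def represent(self, observation):
        """Encode an observation into an internal state."""
        return [math.tanh(x) for x in matvec(self.w_repr, observation)]

    def step(self, state, action):
        """Advance the internal state by one action and predict the reward."""
        one_hot = [1.0 if i == action else 0.0 for i in range(self.num_actions)]
        inputs = state + one_hot
        next_state = [math.tanh(x) for x in matvec(self.w_dyn, inputs)]
        reward = matvec(self.w_reward, inputs)[0]
        return next_state, reward

    def predict(self, state):
        """Predict an action-selection policy and a value for a state."""
        policy = softmax(matvec(self.w_policy, state))
        value = matvec(self.w_value, state)[0]
        return policy, value


def unroll(model, observation, actions):
    """Apply the model iteratively, never consulting the real environment."""
    state = model.represent(observation)
    outputs = []
    for action in actions:
        state, reward = model.step(state, action)
        policy, value = model.predict(state)
        outputs.append((reward, policy, value))
    return outputs


model = ToyLearnedModel(obs_dim=4, hidden_dim=8, num_actions=3)
trajectory = unroll(model, [0.1, -0.2, 0.3, 0.0], actions=[0, 2, 1])
for reward, policy, value in trajectory:
    print(round(reward, 3), [round(p, 3) for p in policy], round(value, 3))
```

In this sketch only the first call touches an observation; every later step works on the model's internal state, which is one way to read "without any knowledge of their underlying dynamics". A planner would call `step` and `predict` many times while exploring candidate action sequences.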

BibTeX key: schrittwieser2019mastering
entry type: misc
year: 2019
url: http://arxiv.org/abs/1911.08265
note: cite arxiv:1911.08265


Cite this publication

@misc{schrittwieser2019mastering,
  abstract = {Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the MuZero algorithm which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. MuZero learns a model that, when applied iteratively, predicts the quantities most directly relevant to planning: the reward, the action-selection policy, and the value function. When evaluated on 57 different Atari games - the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled - our new algorithm achieved a new state of the art. When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.},
  added-at = {2020-12-29T16:42:13.000+0100},
  author = {Schrittwieser, Julian and Antonoglou, Ioannis and Hubert, Thomas and Simonyan, Karen and Sifre, Laurent and Schmitt, Simon and Guez, Arthur and Lockhart, Edward and Hassabis, Demis and Graepel, Thore and Lillicrap, Timothy and Silver, David},
  biburl = {https://www.bibsonomy.org/bibtex/2efec1252d6981140f202f6d26461b4a2/louissf},
  description = {Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model},
  interhash = {c1d97396ec14c7a6905b23b46a69b34f},
  intrahash = {efec1252d6981140f202f6d26461b4a2},
  keywords = {ai chess games learning},
  note = {cite arxiv:1911.08265},
  timestamp = {2020-12-29T16:42:13.000+0100},
  title = {Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model},
  url = {http://arxiv.org/abs/1911.08265},
  year = 2019
}
