Can Agents Learn by Analogy? An Inferable Model for PAC Reinforcement Learning

Year
2020
Type(s)
Author(s)
Y. Sun, F. Huang
Source
19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020)
Url
http://ifaamas.org/Proceedings/aamas2020/pdfs/p1332.pdf
BibTeX

Model-based reinforcement learning algorithms make decisions by building and utilizing a model of the environment. However, none of the existing algorithms attempts to infer the dynamics of a state-action pair from known state-action pairs before visiting it a sufficient number of times. We propose a new model-based method called Greedy Inference Model (GIM) that infers unknown dynamics from known dynamics based on the internal spectral properties of the environment. In other words, GIM can "learn by analogy". We further introduce a new exploration strategy that ensures the agent rapidly and evenly visits unknown state-action pairs. GIM is much more computationally efficient than state-of-the-art model-based algorithms, as its number of dynamic programming operations is independent of the environment size. Under mild conditions, GIM also achieves lower sample complexity than methods that do not perform inference. Experimental results demonstrate the effectiveness and efficiency of GIM on a variety of real-world tasks.
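GIM's exact inference procedure is given in the paper; as a rough illustration of the "learn by analogy" idea, the sketch below (a hypothetical example, not the authors' algorithm) assumes the dynamics matrix over (state, action) pairs has low rank, so entries never observed can be inferred from entries already known. It completes a partially observed, synthetic low-rank transition matrix via iterative truncated-SVD imputation using NumPy:

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic low-rank dynamics matrix: rows index (state, action) pairs,
    # columns index next states. The rank-r structure stands in for the
    # "internal spectral properties" that make inference by analogy possible
    # (an assumption for illustration, not the paper's exact model).
    n_sa, n_s, rank = 40, 20, 3
    D = rng.random((n_sa, rank)) @ rng.random((rank, n_s))
    D /= D.sum(axis=1, keepdims=True)      # normalize rows to probability distributions

    mask = rng.random(D.shape) < 0.6       # entries the agent has observed so far
    X = np.where(mask, D, 0.0)             # unknown entries start at zero

    # Hard-impute: alternate a truncated SVD with re-insertion of observed entries.
    for _ in range(200):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X = np.where(mask, D, low_rank)    # keep known dynamics fixed

    err = np.abs(X - D)[~mask].mean()
    print(f"mean abs error on inferred (unseen) entries: {err:.4f}")

With rank-3 structure and roughly 60% of entries observed, the imputation recovers the unseen entries to within a small error, mirroring how spectral structure can let an agent reason about state-action pairs it has not yet visited enough.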

The code is here, and the arXiv version is here.