Theory and Practice of Reinforcement Learning 2
The goal of this proposal is to extend our previous grant "Theory and Practice of Reinforcement Learning (RL) & Search". The work funded by this extraordinarily successful project has so far led to over 40 publications, some of which won best paper awards. RL is closely related to how animals and humans act and learn. Without a teacher, solely from occasional real-valued pain and pleasure signals, RL agents must discover how to interact with a dynamic environment to maximize their future expected reward. The classical approach to RL makes unrealistically strong assumptions, such as: the current input of the agent tells it all it needs to know about the environment. Our more general methods learn to create memories of important events, solving numerous RL/optimization tasks unsolvable by classical RL methods. Project subgoals include: (1) further improve our state-of-the-art policy gradient techniques and Natural Stochastic Search Strategies, (2) improve adaptive memory-based controllers on RL problems involving high-dimensional state descriptions (HDSD), (3) develop multi-modular RL methods, (4) combine unsupervised sequence encoders for dimensionality reduction with RL, (5) use vision-based RL benchmarks to test our methods, (6) devise theoretically optimal ways of maximizing the information gain of RL agents exploring their environment.
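To illustrate the basic policy gradient idea underlying subgoal (1), the following is a minimal sketch (not the project's actual methods): a softmax policy trained with the standard REINFORCE update on a hypothetical two-armed Bernoulli bandit, where the agent learns purely from scalar reward signals. The environment, learning rate, and episode count are illustrative assumptions.

```python
import math
import random

def softmax(prefs):
    # Numerically stable softmax over action preferences.
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(arm_probs, episodes=2000, lr=0.1, seed=0):
    """Train a softmax policy on a Bernoulli bandit via REINFORCE.

    arm_probs: hypothetical per-arm success probabilities (illustrative).
    Returns the learned action probabilities.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(arm_probs)   # action preferences
    baseline = 0.0                   # running mean reward as variance-reducing baseline
    for t in range(1, episodes + 1):
        probs = softmax(theta)
        # Sample an action from the current stochastic policy.
        a = rng.choices(range(len(probs)), weights=probs)[0]
        # Scalar reward: 1 with probability arm_probs[a], else 0.
        r = 1.0 if rng.random() < arm_probs[a] else 0.0
        baseline += (r - baseline) / t
        # REINFORCE: grad of log pi(a) w.r.t. theta[k] is 1[k == a] - probs[k].
        for k in range(len(theta)):
            grad = (1.0 if k == a else 0.0) - probs[k]
            theta[k] += lr * (r - baseline) * grad
    return softmax(theta)

# The policy should come to prefer the better arm (probability 0.8).
learned = reinforce_bandit([0.2, 0.8])
```

The running-mean baseline is a common variance-reduction choice; the project's own techniques refine such gradient estimators well beyond this toy setting.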