Experience sharing hierarchical deep reinforcement learning

Project status: Under Development

Robotics, Artificial Intelligence

Overview / Usage

This work introduces a new reinforcement learning (RL) model for cooperative multiagent domains. The goal is to speed up learning by sharing experience between agents, exploiting the parallel exploration that cooperative agents perform in a multiagent setting. It builds on previous work in transfer learning for RL and in multiagent settings.

Methodology / Approach

RL is a learning mechanism by which an agent optimizes its policy with the objective of maximizing the total reward received from an environment. In a seminal paper in the field of RL, Tan (1993) posed the questions: "Given the same number of RL agents, will cooperative agents outperform independent agents who do not communicate during learning?" and "What is the price for such cooperation?"
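
To make this mechanism concrete, the sketch below shows one episode of tabular Q-learning, the classic setting Tan studied. The environment interface (reset, step, actions) and the hyperparameter values are illustrative assumptions, not part of this project's code.

    import random
    from collections import defaultdict

    def q_learning_episode(env, q, alpha=0.1, gamma=0.99, epsilon=0.1):
        # One episode of tabular Q-learning: the agent nudges its value
        # estimates toward the discounted reward-maximizing target.
        state = env.reset()  # hypothetical environment interface
        done = False
        while not done:
            # Epsilon-greedy exploration over a discrete action set.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            # Temporal-difference update.
            target = reward + gamma * max(q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state

    q = defaultdict(float)  # unseen state-action pairs start at zero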

Tan observes that this kind of learning is a core component of behavior in human society. An individual agent does not need to learn everything from scratch: while learning from its own experience, it also exchanges knowledge with peers and teachers to accelerate learning. This intuition extends to multiagent scenarios with cooperative agents, where agents are either attempting to achieve a common goal or coexisting in the environment while pursuing individual goals.
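
One simple way to realize this exchange in code is a replay buffer shared by all cooperative agents, so each agent can train on transitions its peers explored in parallel. This is only a sketch of the idea; the class name, capacity, and sharing rule are assumptions for illustration, not the project's final design.

    import random
    from collections import deque

    class SharedReplayBuffer:
        # Experience pool written to by every cooperative agent.

        def __init__(self, capacity=100000):
            self.buffer = deque(maxlen=capacity)

        def add(self, agent_id, state, action, reward, next_state, done):
            # Any agent can contribute a transition it experienced.
            self.buffer.append((agent_id, state, action, reward, next_state, done))

        def sample(self, batch_size):
            # An agent trains on a mix of its own and its peers' experience.
            return random.sample(self.buffer, min(batch_size, len(self.buffer)))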

Algorithms proposed in previous work have mainly focused on environments with discrete or discretized state spaces. Although some real-world problems can be modeled this way, the RL community has recently turned to continuous state space problems, which more faithfully represent application domains such as robotics.
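
In a continuous state space the tabular Q-function above no longer applies, so value estimates are typically produced by a function approximator. Below is a minimal sketch of such an approximator using Keras (from the technology stack listed later); the layer sizes are arbitrary placeholders.

    import tensorflow as tf

    def build_q_network(state_dim, num_actions):
        # Maps a continuous state vector to one Q-value per discrete
        # action, replacing the lookup table used in discrete domains.
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(64, activation="relu", input_shape=(state_dim,)),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(num_actions),  # linear Q-value outputs
        ])
        model.compile(optimizer="adam", loss="mse")
        return model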

In this research, we set out to answer the questions originally posed by Tan, extended to complex domains that have continuous state spaces and bear a closer resemblance to real-world settings. More specifically, we define our question as: "Is it possible to accelerate learning in complex cooperative multiagent environments by sharing experience among agents, considering the added communication costs involved?"
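
Answering this question requires weighing any learning speed-up against what the agents pay to communicate. One possible piece of bookkeeping, shown purely as an illustration (the project's actual cost metric is not specified here), is to count the transitions each agent broadcasts:

    class CommunicationLedger:
        # Tracks how much experience each agent shares, so the speed-up
        # from cooperation can be weighed against its communication cost.

        def __init__(self):
            self.transitions_sent = {}

        def record(self, agent_id, num_transitions):
            self.transitions_sent[agent_id] = (
                self.transitions_sent.get(agent_id, 0) + num_transitions
            )

        def total_cost(self):
            # Total transitions exchanged across all agents.
            return sum(self.transitions_sent.values())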

Technologies Used

Python, TensorFlow, Keras, OpenAI Gym, OpenAI Retro (open to suggestions on the technology stack; not fixed)

Repository

https://github.com/lucasosouza
