Ermo Wei, Drew Wicke and Sean Luke
Learning to coordinate is a hard task for reinforcement learning due to a game-theoretic pathology known as relative overgeneralization. To help deal with this issue, we propose two methods which apply forms of imitation learning to the problem of learning coordinated behaviors. The proposed methods have a close connection to multiagent actor-critic models, and will avoid relative overgeneralization if the right demonstrations are given. We compare our algorithms with MADDPG, a state-of-the-art approach, and show that our methods achieve better coordination in multiagent cooperative tasks.
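Relative overgeneralization can be illustrated with a small sketch. The payoff matrix below is a hypothetical cooperative game in the style of the well-known Climb game, not an example taken from the paper, and the helper names are invented for illustration. An agent that scores each of its actions by averaging over a uniformly random partner prefers a safe action, even though a different joint action gives the highest shared reward.

```python
# Hypothetical shared payoff matrix (an assumption for illustration):
# PAYOFF[i][j] is the team reward when agent 1 plays i and agent 2 plays j.
PAYOFF = [
    [11, -30, 0],
    [-30, 7, 6],
    [0, 0, 5],
]


def average_payoffs(payoff):
    """Mean reward of each row action when the partner acts uniformly at random."""
    return [sum(row) / len(row) for row in payoff]


def best_joint_action(payoff):
    """Joint action (i, j) with the highest shared reward."""
    cells = [
        (payoff[i][j], i, j)
        for i in range(len(payoff))
        for j in range(len(payoff[i]))
    ]
    _, i, j = max(cells)
    return i, j


avg = average_payoffs(PAYOFF)
# Action an averaging learner would pick for agent 1.
greedy_action = max(range(len(avg)), key=avg.__getitem__)

print("average payoff per action:", avg)
print("action preferred by averaging:", greedy_action)
print("optimal joint action:", best_joint_action(PAYOFF))
```

In this toy game the averaging learner settles on the third action, while the best joint action requires both agents to play the first one; the heavy penalties for miscoordination pull the average of the optimal action below that of the safe one.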
Hanqing Wang, Wei Liang and Lap-Fai Yu
A scene layout, such as a room or a courtyard composed of objects, is usually implemented manually, because the complicated state results in a large search space for a machine. In this paper, we propose a learning-based approach that implements the layout automatically. Our approach has two components. Its main structure is a Monte Carlo search tree, which searches for the most valuable move from the current state. A neural network estimates the values of the leaf nodes of the search tree. With the power of deep reinforcement learning, the network learns how to move the objects through millions of trial-and-error attempts. We demonstrate our approach on different scenes and compare its performance with human performance in our experiments.
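The combination of tree search and a learned leaf evaluator can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-dimensional "scene", the constants `SIZE` and `TARGET`, the moves, and the hand-written `value_fn` (standing in for the trained neural network) are all hypothetical.

```python
import math

MOVES = (-1, +1)        # hypothetical moves: shift one object left or right
SIZE, TARGET = 10, 7    # hypothetical 1-D scene and the desired slot


def step(state, move):
    """Apply a move, keeping the object inside the scene bounds."""
    return min(SIZE - 1, max(0, state + move))


def value_fn(state):
    """Stand-in for the learned value network: closer to TARGET scores higher."""
    return 1.0 - abs(state - TARGET) / SIZE


class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}  # move -> Node
        self.visits = 0
        self.total = 0.0


def ucb(parent, child, c=0.5):
    """UCB1 score: mean value plus an exploration bonus."""
    if child.visits == 0:
        return float("inf")
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)


def search(root_state, simulations=500, max_depth=5):
    """Return the most visited root move after a fixed number of simulations."""
    root = Node(root_state)
    for _ in range(simulations):
        node, path = root, [root]
        # Selection: follow the highest UCB score through expanded nodes.
        while node.children and len(path) <= max_depth:
            parent = node
            node = max(parent.children.values(), key=lambda ch: ucb(parent, ch))
            path.append(node)
        # Expansion: add children to an unexpanded node within the depth limit.
        if not node.children and len(path) <= max_depth:
            for move in MOVES:
                node.children[move] = Node(step(node.state, move))
        # Evaluation: score the leaf with the value function, not a rollout.
        value = value_fn(node.state)
        # Backup: propagate the estimate to every node on the path.
        for visited in path:
            visited.visits += 1
            visited.total += value
    return max(root.children, key=lambda m: root.children[m].visits)


best_move = search(3)
print("chosen move from position 3:", best_move)
```

The design point the sketch tries to capture is that leaf nodes are scored by a value estimate rather than by random playouts, so the quality of the search depends on how well that estimator has been trained.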