Authors: Kai Arulkumaran, Nat Dilokthanakul, Murray Shanahan, Anil Anthony Bharath
ArXiv: 1604.08153
Abstract URL: http://arxiv.org/abs/1604.08153v3
In this paper we combine one method for hierarchical reinforcement learning -
the options framework - with deep Q-networks (DQNs) through the use of
different "option heads" on the policy network, and a supervisory network for
choosing between the different options. We utilise our setup to investigate the
effects of architectural constraints in subtasks with positive and negative
transfer, across a range of network capacities. We empirically show that our
augmented DQN has lower sample complexity when simultaneously learning subtasks
with negative transfer, without degrading performance when learning subtasks
with positive transfer.
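
The abstract describes a DQN whose policy network carries several "option heads" plus a supervisory network that selects among them. Below is a minimal PyTorch sketch of that architecture, not the paper's implementation: the class name `OptionHeadDQN`, the MLP torso (the paper's DQN would use a convolutional torso for pixel inputs), and the linear supervisor are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class OptionHeadDQN(nn.Module):
    """Shared torso with one Q-value head per option, plus a supervisor
    that chooses which head to use (illustrative sketch)."""

    def __init__(self, obs_dim: int, n_actions: int, n_options: int, hidden: int = 128):
        super().__init__()
        # Shared feature extractor; a small MLP keeps the sketch
        # self-contained (the paper's setup would use a DQN-style torso).
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One Q-value head per option/subtask ("option heads").
        self.option_heads = nn.ModuleList(
            [nn.Linear(hidden, n_actions) for _ in range(n_options)]
        )
        # Supervisory network: scores each option given the state features.
        self.supervisor = nn.Linear(hidden, n_options)

    def forward(self, obs: torch.Tensor):
        features = self.torso(obs)
        option_logits = self.supervisor(features)        # (batch, n_options)
        q_values = torch.stack(
            [head(features) for head in self.option_heads], dim=1
        )                                                # (batch, n_options, n_actions)
        return option_logits, q_values

    def act(self, obs: torch.Tensor) -> torch.Tensor:
        """Greedy control: pick an option via the supervisor, then act
        greedily within that option's Q-head."""
        option_logits, q_values = self.forward(obs)
        option = option_logits.argmax(dim=-1)            # (batch,)
        chosen_q = q_values[torch.arange(obs.shape[0]), option]
        return chosen_q.argmax(dim=-1)


# Usage with dummy inputs: 8 observations of dimension 16,
# 4 actions, 2 options (hypothetical sizes).
net = OptionHeadDQN(obs_dim=16, n_actions=4, n_options=2)
actions = net.act(torch.randn(8, 16))
print(actions.shape)  # torch.Size([8])
```

Keeping the torso shared while splitting the heads is what lets subtasks with negative transfer avoid interfering at the output layer, while subtasks with positive transfer still share features in the torso.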