We are very excited to join forces with MLCommons and OctoML.ai! Contact Grigori Fursin for more details!

GPU Memory Management for Deep Neural Networks Using Deep Q-Network

lib:da111b4f3db30bea (v1.0.0)

Authors: Anonymous
Where published: ICLR 2020 1
Document:  PDF  DOI 
Abstract URL: https://openreview.net/forum?id=BJxg7eHYvB

Deep neural networks use deeper and broader structures to achieve better performance and consequently, use increasingly more GPU memory as well. However, limited GPU memory restricts many potential designs of neural networks. In this paper, we propose a reinforcement learning based variable swapping and recomputation algorithm to reduce the memory cost, without sacrificing the accuracy of models. Variable swapping can transfer variables between CPU and GPU memory to reduce variables stored in GPU memory. Recomputation can trade time for space by removing some feature maps during forward propagation. Forward functions are executed once again to get the feature maps before reuse. However, how to automatically decide which variables to be swapped or recomputed remains a challenging problem. To address this issue, we propose to use a deep Q-network(DQN) to make plans. By combining variable swapping and recomputation, our results outperform several well-known benchmarks.

Relevant initiatives  

Related knowledge about this paper Reproduced results (crowd-benchmarking and competitions) Artifact and reproducibility checklists Common formats for research projects and shared artifacts Reproducibility initiatives


Please log in to add your comments!
If you notice any inapropriate content that should not be here, please report us as soon as possible and we will try to remove it within 48 hours!