
Parallelizable Stack Long Short-Term Memory

lib:205e4c889975f719 (v1.0.0)

Authors: Shuoyang Ding, Philipp Koehn
Where published: WS 2019
ArXiv: 1904.03409
Document: PDF / DOI
Artifact development version: GitHub
Abstract URL: http://arxiv.org/abs/1904.03409v1


Stack Long Short-Term Memory (StackLSTM) is useful for various applications such as parsing and string-to-tree neural machine translation, but it is also known to be notoriously difficult to parallelize for GPU training due to the fact that the computations are dependent on discrete operations. In this paper, we tackle this problem by utilizing state access patterns of StackLSTM to homogenize computations with regard to different discrete operations. Our parsing experiments show that the method scales up almost linearly with increasing batch size, and our parallelized PyTorch implementation trains significantly faster compared to the Dynet C++ implementation.
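
The core idea from the abstract can be sketched in a few lines of PyTorch: keep each stack as a pre-allocated buffer of hidden states plus a stack-top pointer, so that push, pop, and hold all reduce to the same sequence of tensor operations (gather, LSTM cell, pointer arithmetic, scatter) and can therefore be batched on the GPU. The sketch below is illustrative only, assuming this buffer-and-pointer formulation; the class and names (BatchedStackLSTM, PUSH/POP/HOLD, max_depth) are inventions of this summary, not the authors' released implementation, which is linked on GitHub above.

import torch
import torch.nn as nn

PUSH, POP, HOLD = 1, -1, 0  # pointer increments for the three discrete stack operations


class BatchedStackLSTM(nn.Module):
    # Each stack is a fixed-size buffer of states plus a top pointer per batch element.

    def __init__(self, input_size, hidden_size, max_depth=64):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.hidden_size = hidden_size
        self.max_depth = max_depth

    def init_state(self, batch_size, device=None):
        # Slot 0 holds the empty-stack state (all zeros).
        h_buf = torch.zeros(batch_size, self.max_depth, self.hidden_size, device=device)
        c_buf = torch.zeros(batch_size, self.max_depth, self.hidden_size, device=device)
        ptr = torch.zeros(batch_size, dtype=torch.long, device=device)
        return h_buf, c_buf, ptr

    def forward(self, x, op, state):
        # x: (B, input_size); op: (B,) with entries in {PUSH, POP, HOLD}.
        h_buf, c_buf, ptr = state
        idx = ptr.view(-1, 1, 1).expand(-1, 1, self.hidden_size)
        # Every batch element reads its current stack top: one gather serves all operations.
        h_top = h_buf.gather(1, idx).squeeze(1)
        c_top = c_buf.gather(1, idx).squeeze(1)
        h_new, c_new = self.cell(x, (h_top, c_top))
        # Pointer arithmetic replaces branching on push/pop/hold.
        ptr = (ptr + op).clamp(0, self.max_depth - 1)
        write_idx = ptr.view(-1, 1, 1).expand(-1, 1, self.hidden_size)
        # Only pushes write the fresh state into the buffer; pop/hold write back
        # whatever is already stored at the (possibly moved) pointer position.
        push = (op == PUSH).view(-1, 1, 1).to(h_new.dtype)
        h_buf = h_buf.scatter(1, write_idx,
                              push * h_new.unsqueeze(1) + (1 - push) * h_buf.gather(1, write_idx))
        c_buf = c_buf.scatter(1, write_idx,
                              push * c_new.unsqueeze(1) + (1 - push) * c_buf.gather(1, write_idx))
        # The step output is whatever now sits at the stack top.
        out = h_buf.gather(1, write_idx).squeeze(1)
        return out, (h_buf, c_buf, ptr)


# Tiny usage example: four batch elements performing different operations in one step.
model = BatchedStackLSTM(input_size=8, hidden_size=16)
state = model.init_state(batch_size=4)
x = torch.randn(4, 8)
op = torch.tensor([PUSH, PUSH, POP, HOLD])
out, state = model(x, op, state)
print(out.shape)  # torch.Size([4, 16])

Because every batch element executes the identical gather/cell/scatter path regardless of its discrete operation, adding more sentences to a batch only widens the tensors, which is consistent with the near-linear scaling with batch size reported in the abstract.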

Relevant initiatives

Related knowledge about this paper:
Reproduced results (crowd-benchmarking and competitions)
Artifact and reproducibility checklists
Common formats for research projects and shared artifacts
Reproducibility initiatives
