Sampling strategies in Siamese Networks for unsupervised speech representation learning

lib:80264434d9c9b82c (v1.0.0)

Authors: Rachid Riad, Corentin Dancette, Julien Karadayi, Neil Zeghidour, Thomas Schatz, Emmanuel Dupoux
ArXiv: 1804.11297

Abstract URL: http://arxiv.org/abs/1804.11297v2

Recent studies have investigated siamese network architectures for learning invariant speech representations using same-different side information at the word level. Here we systematically investigate an often-ignored component of siamese networks: the sampling procedure (how pairs of same vs. different tokens are selected). We show that sampling strategies that take into account Zipf's law, the distribution of speakers, and the proportions of same and different pairs of words significantly affect the performance of the network. In particular, we show that word frequency compression improves learning across a wide range of numbers of training pairs. This effect does not apply to the same extent in the fully unsupervised setting, where the pairs of same-different words are obtained by spoken term discovery. We apply these results to pairs of words discovered using an unsupervised algorithm and show an improvement over the state of the art in unsupervised representation learning using siamese networks.
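
To make the idea of word frequency compression concrete, here is a minimal sketch in Python. It is not taken from the paper or its artifact: the function names, the `alpha` parameter, and the choice of a power-law compression (weight = count ** alpha) are assumptions made for illustration. With `alpha=1` word types are sampled in proportion to their raw (Zipfian) counts; with `alpha=0` every type is equally likely; values in between compress the frequency distribution.

```python
import random
from collections import Counter


def compressed_type_weights(tokens, alpha=0.5):
    """Return one sampling weight per word type: count ** alpha.

    Hypothetical helper: `alpha` is an assumed compression exponent,
    not a parameter named in the abstract.
    """
    counts = Counter(tokens)
    return {word: count ** alpha for word, count in counts.items()}


def sample_word_types(tokens, k, alpha=0.5, seed=0):
    """Draw k word types with probability proportional to count ** alpha."""
    weights = compressed_type_weights(tokens, alpha)
    words = sorted(weights)  # fixed order so a given seed is reproducible
    rng = random.Random(seed)
    return rng.choices(words, weights=[weights[w] for w in words], k=k)
```

For example, with `alpha=0.5` a word type seen 16 times gets weight 4 and a type seen once gets weight 1, so the frequent type is drawn 4 times as often rather than 16 times as often. Pairs of same or different tokens would then be built from the sampled types; that step, and any balancing across speakers, is left out of this sketch.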
