
Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages

lib:191308dd85be3832 (v1.0.0)

Authors: Peter Baumann, Janet Pierrehumbert
Where published: LREC 2014 5

Abstract URL: https://www.aclweb.org/anthology/L14-1035/

The world-wide proliferation of digital communications has created the need for language and speech processing systems for under-resourced languages. Developing such systems is challenging if only small data sets are available, and the problem is exacerbated for languages with highly productive morphology. However, many under-resourced languages are spoken in multi-lingual environments together with at least one resource-rich language and thus have numerous borrowings from resource-rich languages. Based on this insight, we argue that readily available resources from resource-rich languages can be used to bootstrap the morphological analyses of under-resourced languages with complex and productive morphological systems. In a case study of two such languages, Tagalog and Zulu, we show that an easily obtainable English wordlist can be deployed to seed a morphological analysis algorithm from a small training set of conversational transcripts. Our method achieves a precision of 100% and identifies 28 and 66 of the most productive affixes in Tagalog and Zulu, respectively.
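
The seeding idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm, which the abstract does not specify; it only assumes that a token beginning or ending with a known English word exposes the remaining material as a candidate affix. The function name `candidate_affixes`, its thresholds, and the sample tokens are all hypothetical and are not taken from the paper's data.

```python
from collections import defaultdict


def candidate_affixes(tokens, seed_words, min_stem_len=4, min_stems=2):
    """Toy affix discovery seeded by a wordlist (illustrative assumption only).

    A token that starts with a seed word contributes its remainder as a
    suffix candidate; a token that ends with a seed word contributes its
    remainder as a prefix candidate. Candidates attested with fewer than
    `min_stems` distinct seed words are discarded.
    """
    # Short seed words match too easily by accident, so they are dropped.
    seeds = {w.lower() for w in seed_words if len(w) >= min_stem_len}
    prefixes = defaultdict(set)
    suffixes = defaultdict(set)

    for token in {t.lower() for t in tokens}:
        for seed in seeds:
            if len(token) <= len(seed):
                continue  # the token is the bare seed word or shorter
            if token.startswith(seed):
                suffixes[token[len(seed):]].add(seed)
            if token.endswith(seed):
                prefixes[token[:-len(seed)]].add(seed)

    def keep(table):
        return {a: sorted(s) for a, s in table.items() if len(s) >= min_stems}

    return keep(prefixes), keep(suffixes)


# Invented tokens for illustration; not from the paper's transcripts.
tokens = ["magtext", "nagtext", "magdrive", "nagdrive", "text", "bahay"]
seed_words = ["text", "drive", "house"]
prefixes, suffixes = candidate_affixes(tokens, seed_words)
```

Requiring each candidate to attach to several distinct seed words is one simple way to favour precision over recall in a sketch like this; whether the paper uses a comparable filter is not stated in the abstract.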
