Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Leveraging pre-trained embeddings for Welsh Taggers

Ezeani, I, Piao, S, Neale, Steven, Rayson, P and Knight, Dawn ORCID: https://orcid.org/0000-0002-4745-6502 2019. Leveraging pre-trained embeddings for Welsh Taggers. Presented at: 4th Workshop on Representation Learning for NLP, Florence, Italy, July 2019. ACL Anthology: Proceedings of the 4th Workshop on Representation Learning for NLP, vol. W19-43. Association for Computational Linguistics. DOI: 10.18653/v1/W19-4332

Full text not available from this repository.

Abstract

While the application of word embedding models to downstream Natural Language Processing (NLP) tasks has been shown to be successful, the benefits for low-resource languages are somewhat limited due to a lack of adequate data for training the models. However, NLP research efforts for low-resource languages have focused on constantly seeking ways to harness pre-trained models to improve the performance of NLP systems built to process these languages without the need to re-invent the wheel. Welsh is one such language. In this paper, we present the results of our experiments on learning a simple multi-task neural network model for part-of-speech and semantic tagging for Welsh using a pre-trained embedding model from FastText. Our model’s performance was compared with that of the existing rule-based stand-alone part-of-speech and semantic taggers. Despite its simplicity and capacity to perform both tasks simultaneously, our tagger compared very well with the existing taggers.
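
The multi-task idea in the abstract (one shared representation feeding two tag predictors) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the tag sets, the function names and the random vectors standing in for pre-trained FastText embeddings are all assumptions made for illustration, and the weights are untrained, so the predicted tags carry no meaning.

```python
import math
import random

random.seed(0)

EMB_DIM, HIDDEN = 8, 6                    # toy sizes (assumption)
POS_TAGS = ["NOUN", "VERB", "ADJ"]        # hypothetical part-of-speech tag set
SEM_TAGS = ["SEM_A", "SEM_B", "SEM_C"]    # hypothetical semantic tag set


def rand_matrix(rows, cols):
    """Return a rows x cols matrix of small random weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Stand-in for a pre-trained embedding table; in practice these vectors
# would be loaded from a pre-trained model rather than drawn at random.
EMBEDDINGS = {w: [random.uniform(-1, 1) for _ in range(EMB_DIM)]
              for w in ["mae", "cath", "ddu"]}
UNK = [0.0] * EMB_DIM                     # vector used for unknown words

W_shared = rand_matrix(HIDDEN, EMB_DIM)          # layer shared by both tasks
W_pos = rand_matrix(len(POS_TAGS), HIDDEN)       # part-of-speech output head
W_sem = rand_matrix(len(SEM_TAGS), HIDDEN)       # semantic output head


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def tag_token(word):
    """Return (pos_probs, sem_probs) computed from one shared hidden layer."""
    x = EMBEDDINGS.get(word.lower(), UNK)
    hidden = [math.tanh(a) for a in matvec(W_shared, x)]
    return softmax(matvec(W_pos, hidden)), softmax(matvec(W_sem, hidden))


def tag_sentence(tokens):
    """Assign the highest-probability POS and semantic tag to each token."""
    tagged = []
    for tok in tokens:
        pos_probs, sem_probs = tag_token(tok)
        pos = POS_TAGS[pos_probs.index(max(pos_probs))]
        sem = SEM_TAGS[sem_probs.index(max(sem_probs))]
        tagged.append((tok, pos, sem))
    return tagged


if __name__ == "__main__":
    print(tag_sentence(["Mae", "cath", "ddu"]))
```

In a trained version of such a model, the two heads would typically be optimised jointly (for example by summing their cross-entropy losses), so that the shared layer learns features useful for both tasks.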

Item Type: Conference or Workshop Item (Paper)
Date Type: Publication
Status: Published
Schools: English, Communication and Philosophy
Publisher: Association for Computational Linguistics
Last Modified: 26 Oct 2022 08:05
URI: https://orca.cardiff.ac.uk/id/eprint/126545
