uoe-mphil_thesis

My Master of Philosophy thesis, defended at the University of Edinburgh in 2021.

ABSTRACT

Contextual morphological analysis is the task of finding the most probable lemma and morpho-syntactic description (i.e. part of speech and grammatical markers, such as case, tense, etc.) for a given word in a given context. Historically, approaches to the task relied on a few words of local context, due both to model design (e.g. HMM) and to data sparseness concerns. With the advent of deep learning, resorting to local context ceased to be a necessity, and modern approaches exclusively use global (sentential) context. In this thesis we investigate whether context restriction can still be useful for neural morphological analysis. For our first set of experiments we adapt a character-level encoder-decoder model that was previously used for the related tasks of lemmatization and morphological generation. We start by showing that using just one word of surrounding context not only yields better results, but is also more efficient than using global context. Then, on a data set of more than a hundred corpora, we show that relying on larger context windows is preferable only when training data is sufficiently large, while using a single word of context is better both in low-resource scenarios and on average. We also discuss the competitive performance of our model at the SIGMORPHON-2019 shared task on contextual morphological analysis, where it was bested only by systems that used pre-trained contextualized and/or regular word embeddings. Finally, we show that augmenting our model with contextualized word embeddings does not increase its performance.

Inspired by the success of our character-level model, in our second set of experiments we try context restriction with a popular off-the-shelf word-level neural morphological analyzer. Here, too, we show that when training data is scarce, limiting context to a few words does improve performance, especially for agglutinative and fusional languages. However, with enough data, using global context is still better. To investigate what restricted models miss from global context, in a follow-up experiment we show that context restriction hinders the model's ability to correctly analyze words whose dependency heads lie beyond the context window. Finally, we find that to improve performance on small data sets, one does not even have to train in a context-restricted manner: limiting context at inference time is enough to achieve comparable performance.
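To make the task and the restricted setting concrete, here is a small illustrative sketch in Python (not taken from the thesis). It shows what a target word with a single word of context on each side looks like as input, and what a lemma plus morpho-syntactic description might look like as output. The window size, tag notation, and all names are assumptions made for illustration only.

    # Illustrative sketch (not from the thesis): a context-restricted
    # input/output pair for contextual morphological analysis.
    # Window size, tag format, and function names are hypothetical.

    def context_window(tokens, index, size=1):
        """Return `size` words of left context, the target word, and `size` words of right context."""
        left = tokens[max(0, index - size):index]
        right = tokens[index + 1:index + 1 + size]
        return left, tokens[index], right

    sentence = "the dogs barked at the mailman".split()

    # Analyze "dogs" (index 1) with one word of context on each side,
    # mirroring the single-word-of-context setting described in the abstract.
    left, target, right = context_window(sentence, 1, size=1)
    print(left, target, right)          # ['the'] dogs ['barked']

    # A plausible analysis: lemma plus morpho-syntactic description
    # (part of speech and grammatical markers) in a UniMorph-style notation.
    analysis = {"lemma": "dog", "msd": "N;PL"}
    print(f"{target} -> {analysis['lemma']} {analysis['msd']}")

A context-restricted analyzer sees only the tuple (left, target, right), whereas a global-context analyzer sees the whole sentence; the thesis compares these two regimes across training data sizes.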
