allisonyw / matser-thesis-better-embedding

matser-thesis-better-embedding

Using signatures in BERT and bi-LSTM NLP neural networks.

This is the code for the master's thesis 'On a better representation of natural language in vector spaces' by Yiran Wei (2019).

The signature calculation uses the Signatory library.
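To make concrete what a truncated signature is, here is a minimal, dependency-free sketch that computes the first two signature levels of a piecewise-linear path. It is an illustration only, not the thesis code and not Signatory's API: the function name `signature_depth2` and the list-of-lists input format are assumptions made for this example, whereas the repository delegates the computation to Signatory.

```python
def signature_depth2(path):
    """Depth-2 truncated signature of a piecewise-linear path.

    path: list of points, each a list of d floats (hypothetical input format).
    Returns (level1, level2): level1 has d entries, level2 is a d x d nested list.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one straight segment:
        # new_level2 = old_level2 + old_level1 (x) delta + 0.5 * delta (x) delta
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        # Level 1 is just the accumulated increment.
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


# A path in the plane that moves right, then up.
level1, level2 = signature_depth2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
print(level1)  # [1.0, 1.0]
print(level2)  # [[0.5, 1.0], [0.0, 0.5]]
```

The off-diagonal entries differ (1.0 versus 0.0), which shows that the second level records the order in which the coordinates moved, something the total increment in level 1 cannot capture.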

Data

The datasets introduced in the thesis can be found in the data file.

sig-transformer

The modules called in the SIG+ notebooks implement the signature methods. The reduced dimension and the truncation order are indicated by the file names. For example, (reduced dimension, truncation order) = (4, 2) corresponds to modelling_bert42.py.
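The naming convention can be sketched as a small helper. This is a hedged illustration: `module_name` and `truncated_signature_size` are hypothetical names that do not appear in the repository, and the file-name pattern for pairs other than (4, 2) is extrapolated from the single example above. The size formula assumes the signature excludes the constant zeroth term, so a path with d channels truncated at order N yields d + d^2 + ... + d^N values.

```python
def module_name(reduced_dim, truncation_order):
    """File name for a (reduced dimension, truncation order) pair.

    Assumption: every module follows the pattern of the one documented example.
    """
    return f"modelling_bert{reduced_dim}{truncation_order}.py"


def truncated_signature_size(channels, depth):
    """Number of terms in a signature truncated at `depth`,
    excluding the constant zeroth term: sum of channels**k for k = 1..depth."""
    return sum(channels ** k for k in range(1, depth + 1))


print(module_name(4, 2))               # modelling_bert42.py
print(truncated_signature_size(4, 2))  # 20
```

Under these assumptions, reducing the embedding to 4 dimensions and truncating at order 2 gives 4 + 16 = 20 signature features per path, which illustrates why the dimension is reduced before the signature is taken: the feature count grows geometrically with the truncation order.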

Languages

Jupyter Notebook 74.3%, Python 25.7%