SFU Natural Language Laboratory's repositories
glm-parser
Tree-adjoining grammar based statistical dependency parser using a general linear model (glm).
lensingwikipedia
Lensing Wikipedia is an interface to visually browse through human history as represented in Wikipedia. This is the source code that runs the website:
HMM-Aligner
An implementation of a word aligner using a Hidden Markov Model.
SFUTranslate
Neural Machine Translation Toolkit by Natlang Laboratory at SFU
neural-network-tagger
A General Purpose Tagger for POS Tagging, NER Tagging, and Chunking.
WordEmbeddingsViz
Visualize bilingual word embeddings.
xtag-english-grammar
The XTAG English Grammar mirrored from the XTAG page http://www.cis.upenn.edu/~xtag/
pe-decipher-toolkit
Python notebook for some basic NLP analysis over the CDLI Proto-Elamite data
sfu-natlang.github.io
Website for the SFU Natural Language Lab.
stag-txt2tex
Simple macros for converting bracketed trees into LaTeX STAG tree-pairs.
align-type-tacl2017-code
Joint prediction of word alignment with alignment types
cgw-samples
Repository for handling group uploads for the cgw in-class exercise.
cipherdaug-nmt
Official code for the paper "CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation", published at the ACL 2022 main conference.
gpu-monitor
A GPU resource monitor for our lab.
gut-besser-chunker
The program used in the paper 'Gut, Besser, Chunker – Selecting the best models for text chunking with voting' by Balázs Indig and István Endrédy
NeuroDecipher
Fork of the code from the ACL paper Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B.
pe-compositionality
Code and data for "Compositionality of Complex Graphemes in the Undeciphered Proto-Elamite Script using Image and Text Embedding Models" in Findings of ACL 2021.
pe-headers
Updated transliterations and labels for the study of headers in proto-Elamite.
pe-pc-datasets
Data derived from the CDLI proto-Elamite and proto-cuneiform corpora.
pe-sign-value-data
Proto-Elamite corpus with hypothetical sound values inserted.
sane-decipher
Computational Decipherment of Scripts from the Ancient Near East
target_rescale_siMT
Official code for the IWSLT 2023 paper "Language Model Based Target Token Importance Rescaling for Simultaneous Neural Machine Translation".