There are 28 repositories under the language-modelling topic.
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL), as well as clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
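The n-gram extraction and frequency-list workflow mentioned above can be sketched in a few lines of plain Python. This illustrates the concept only, not PyNLPl's actual API:

```python
from collections import Counter

def ngrams(tokens, n):
    """Yield successive n-grams from a token list."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

tokens = "the cat sat on the mat".split()

# Frequency lists of unigrams and bigrams
unigram_freq = Counter(ngrams(tokens, 1))
bigram_freq = Counter(ngrams(tokens, 2))

print(unigram_freq[("the",)])        # 2
print(bigram_freq[("the", "cat")])   # 1
```

Such frequency lists are the raw counts from which simple n-gram language models are estimated.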
A lightweight implementation of Beam Search for sequence models in PyTorch.
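Beam search keeps only the top-k highest-scoring partial sequences at each decoding step instead of expanding every hypothesis. A minimal, framework-free sketch of the idea (the `step_fn` and `beam_width` names are illustrative, not this repository's API):

```python
import math

def beam_search(step_fn, start, beam_width, max_len):
    """Generic beam search over log-probabilities.

    step_fn(seq) returns a list of (token, log_prob) continuations;
    an empty list marks a finished sequence.
    """
    beams = [(0.0, [start])]  # (cumulative log-prob, sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            conts = step_fn(seq)
            if not conts:  # finished hypothesis carries over unchanged
                candidates.append((score, seq))
                continue
            for tok, logp in conts:
                candidates.append((score + logp, seq + [tok]))
        # keep only the beam_width best hypotheses
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams

# Toy bigram table standing in for a sequence model
table = {
    "<s>": [("a", math.log(0.6)), ("b", math.log(0.4))],
    "a":   [("b", math.log(0.9)), ("a", math.log(0.1))],
    "b":   [],
}
best = beam_search(lambda seq: table[seq[-1]], "<s>", beam_width=2, max_len=3)
print(best[0][1])  # ['<s>', 'a', 'b']
```

In a PyTorch model, `step_fn` would run one decoder step and return the top token log-probabilities.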
Transformer Based SeqGAN for Language Generation
PyTorch implementation of MaskGAN
Code for my master thesis in Deep Learning: "Generating answers to medical questions using recurrent neural networks"
This repository contains code and data download instructions for the workshop paper "Improving Hierarchical Product Classification using Domain-specific Language Modelling" by Alexander Brinkmann and Christian Bizer.
Language Modelling, CMI vs Perplexity
PyTorch implementations of word embeddings and language modelling.
Memory Based Word Predictor/Language Model http://ilk.uvt.nl/wopr/
Code for paper: "Numeracy Enhances the Literacy of Language Models"
An n-gram language model that learns n-gram probabilities from a given corpus and generates new sentences based on the conditional probabilities of the preceding words and phrases.
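The train-then-generate loop described above can be sketched with bigrams: count continuations for each context word, then sample the next word from the resulting conditional distribution. A minimal sketch, not this repository's code:

```python
import random
from collections import defaultdict, Counter

def train_bigrams(corpus):
    """Count bigram continuations: context word -> Counter of next words."""
    model = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model

def generate(model, max_len=20, rng=random):
    """Sample a sentence word by word from conditional probabilities."""
    word, out = "<s>", []
    for _ in range(max_len):
        counts = model[word]
        total = sum(counts.values())
        word = rng.choices(list(counts), [c / total for c in counts.values()])[0]
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)

corpus = ["the cat sat", "the dog sat"]
model = train_bigrams(corpus)
print(generate(model, rng=random.Random(0)))
```

Higher-order models work the same way, with the context extended to the previous n-1 words.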
Spelling correction and grammar detection with statistical language models
Various Deep Learning concepts implemented using TensorFlow
BERT-based pre-trained non-autoregressive sequence-to-sequence model
A simple series of programs to train gated recurrent neural networks with PyTorch and generate text based on them.
In this project we generate sentences using n-grams
State-of-the-Art Language Modelling in Python with PyTorch.
A sequence CNN inspired by the WaveNet architecture, written in both TensorFlow and PyTorch.
Implementation of Transformer, BERT and GPT models in both TensorFlow 2.0 and PyTorch.
Distributed TensorFlow port of a character-level RNN
An implementation of a core natural language processing method: language modelling
Reproduction of CIFAR-10/CIFAR-100 and Penn Treebank experiments to test claims in "Lookahead Optimizer: k steps forward, 1 step back" https://arxiv.org/abs/1907.08610
Decoder model for language modelling
This code project accompanies my master thesis (done at the company Omicron Ceti AB) during spring 2019 at the Royal Institute of Technology. The report to which the code relates is included in this GitHub repository as "report.pdf".
Natural Language Processing topics and projects.
Introduces basic NLP tasks and methods with examples