There are 26 repositories under the language-modeling topic.
Plug and Play Language Model (PPLM) implementation. Allows steering the topic and attributes of GPT-2 models.
The most accurate natural language detection library for Go, suitable for long and short text alike
Keras implementation of BERT with pre-trained weights
A Modern C++ Data Sciences Toolkit
End-to-end ASR/LM implementation with PyTorch
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.
Official PyTorch Repo for "ReZero is All You Need: Fast Convergence at Large Depth"
An implementation of DeepMind's Relational Recurrent Neural Networks (NeurIPS 2018) in PyTorch.
This repository contains a collection of papers and resources on Reasoning in Large Language Models.
Hands-on course: NLP from zero to one hundred 🤗
Lyrics Generator, a.k.a. character-level language modeling with a multi-layer LSTM recurrent neural network (see the sketch below)
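For context, the technique this repository names fits in a few lines. The following is a minimal sketch in PyTorch (the framework choice, class name `CharLM`, and all hyperparameters are illustrative assumptions, not taken from the repository): character indices are embedded, passed through a stacked LSTM, and projected to next-character logits at every step.

```python
# Minimal character-level LSTM language model sketch (assumed PyTorch).
import torch
import torch.nn as nn

class CharLM(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=256, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, state=None):
        # x: [batch, time] character indices; the model predicts the
        # next character at each position.
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state
```

Training would minimize cross-entropy between the logits and the input shifted by one character; sampling from the logits one step at a time generates lyrics.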
Use TensorFlow's tf.scan to build vanilla, GRU, and LSTM RNNs
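As a sketch of the idea named above: tf.scan threads an accumulator (here, the hidden state) through a sequence by repeatedly applying a step function, which is exactly a recurrence. The snippet below unrolls a vanilla RNN this way; it assumes TensorFlow 2.x, and the sizes are arbitrary illustrative values.

```python
# Vanilla RNN unrolled over time with tf.scan (assumed TensorFlow 2.x).
import tensorflow as tf

input_size, hidden_size, batch, time = 32, 64, 4, 10
W_x = tf.Variable(tf.random.normal([input_size, hidden_size], stddev=0.1))
W_h = tf.Variable(tf.random.normal([hidden_size, hidden_size], stddev=0.1))
b = tf.Variable(tf.zeros([hidden_size]))

def step(h_prev, x_t):
    # One vanilla RNN step: h_t = tanh(x_t W_x + h_{t-1} W_h + b)
    return tf.tanh(tf.matmul(x_t, W_x) + tf.matmul(h_prev, W_h) + b)

# inputs are time-major: [time, batch, input_size]; tf.scan carries the
# hidden state from one timestep to the next.
inputs = tf.random.normal([time, batch, input_size])
h0 = tf.zeros([batch, hidden_size])
states = tf.scan(step, inputs, initializer=h0)  # [time, batch, hidden_size]
```

Swapping the body of `step` for GRU or LSTM update equations yields the other two variants without changing the scan itself.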
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (DEPRECATED)
Independently Recurrent Neural Networks (IndRNN) implemented in PyTorch.
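The defining idea of IndRNN is that the recurrent weight is element-wise rather than a full matrix, so each hidden unit evolves independently: h_t = relu(W x_t + u ⊙ h_{t-1} + b). A minimal cell illustrating that recurrence is below; this is an illustrative sketch in PyTorch, not code from the repository.

```python
# Minimal IndRNN cell sketch: each hidden unit has its own scalar
# recurrent weight u[i], i.e. h_t = relu(W x_t + u * h_{t-1} + b).
import torch
import torch.nn as nn

class IndRNNCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.W = nn.Linear(input_size, hidden_size)  # includes the bias b
        self.u = nn.Parameter(torch.empty(hidden_size).uniform_(-1.0, 1.0))

    def forward(self, x_t, h_prev):
        # Element-wise recurrence instead of a full hidden-to-hidden matrix.
        return torch.relu(self.W(x_t) + self.u * h_prev)
```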
Comparatively fine-tuning pretrained BERT models on downstream, text classification tasks with different architectural configurations in PyTorch.
A deep learning framework for natural language processing, built on TensorFlow and modeled on the Scikit-Learn API. Supports 40+ model classes covering language modeling, text classification, NER, MRC, knowledge distillation, and more.
Awesome resources for in-context learning and prompt engineering: mastering LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date, cutting-edge content.
Training an n-gram-based language model using the KenLM toolkit for Deep Speech 2
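KenLM models are typically trained with the toolkit's lmplz binary and then queried from Python through the kenlm bindings. The sketch below assumes the bindings are installed and an ARPA model has already been built from a text corpus; the file names are hypothetical.

```python
# Assumes the `kenlm` Python bindings are installed and an ARPA model was
# already built with KenLM's lmplz binary, e.g.:
#   lmplz -o 5 < corpus.txt > lm.arpa
import kenlm

model = kenlm.Model('lm.arpa')  # hypothetical path
sentence = 'the quick brown fox'
print(model.score(sentence, bos=True, eos=True))  # total log10 probability
print(model.perplexity(sentence))                 # per-word perplexity
```

In an ASR pipeline like Deep Speech 2, such scores are combined with the acoustic model's output during beam-search decoding to rerank candidate transcripts.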
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in PyTorch
Pre-training of Language Models for Language Understanding
Character-level language models
Attentive federated learning for private neural language modeling (NLM)
MirasText
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implicit Bayesian Inference"
M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
Recurrent Neural Networks (RNN, GRU, LSTM) and their bidirectional versions (BiRNN, BiGRU, BiLSTM) for word- & character-level language modelling in Theano