There are 0 repositories under the languagemodel topic.
CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
[NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (NeurIPS22).
High-performance small model evaluation. Shared Tasks in NLPCC 2020. Task 1 - Light Pre-Training Chinese Language Model for NLP Task
Embeddings: State-of-the-art Text Representations for Natural Language Processing tasks. The initial version of the library focuses on the Polish language.
The PreTENS shared task, hosted at SemEval 2022, focuses on semantic competence, specifically on evaluating whether language models recognize appropriate taxonomic relations between two nominal arguments (i.e., cases where one is a supercategory of the other or, in extensional terms, one denotes a superset of the other).
Code for "Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization"
A 78.5% word sense disambiguator based on Transformers and RoBERTa (PyTorch)
Informal-to-formal dataset for masked language modeling (MLM)
Simple next word prediction model from scratch, implemented using only numpy.
Personality test that classifies respondents into four personality types. The classification uses Multinomial Naive Bayes, a natural language processing classification algorithm.
The project generates a sentence from a user-supplied starting phrase such as "Ilbierah kont"; the script attempts to build a sentence off of that phrase. The generator works in an n-gram fashion, with unigram, bigram, and trigram models as its main structures. The perplexity of each n-gram model was also calculated.
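
To make the n-gram description in the last entry concrete, here is a minimal sketch of a bigram generator with a perplexity score. It is not that repository's code: the function names (`train_bigram`, `bigram_prob`, `generate`, `perplexity`), the whitespace tokenizer, the toy English corpus, and the add-one smoothing are all assumptions chosen for illustration, and only the bigram case is shown.

```python
import math
import random
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count tokens and bigrams, adding <s> and </s> boundary markers."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            bigrams[prev][nxt] += 1
    return unigrams, bigrams

def bigram_prob(prev, nxt, unigrams, bigrams):
    """Add-one (Laplace) smoothed estimate of P(nxt | prev); an assumed choice."""
    followers = bigrams.get(prev, Counter())
    context_count = sum(followers.values())
    return (followers[nxt] + 1) / (context_count + len(unigrams))

def generate(start_phrase, bigrams, max_len=20, seed=None):
    """Extend the starting phrase by sampling followers of the last token."""
    rng = random.Random(seed)
    tokens = start_phrase.split()
    while len(tokens) < max_len:
        followers = bigrams.get(tokens[-1])
        if not followers:
            break  # the last token was never seen as a context
        candidates = list(followers)
        weights = [followers[c] for c in candidates]
        nxt = rng.choices(candidates, weights=weights)[0]
        if nxt == "</s>":
            break  # the model chose to end the sentence
        tokens.append(nxt)
    return " ".join(tokens)

def perplexity(sentence, unigrams, bigrams):
    """Perplexity = exp of the negative mean log-probability per bigram."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        log_prob += math.log(bigram_prob(prev, nxt, unigrams, bigrams))
    return math.exp(-log_prob / (len(tokens) - 1))

if __name__ == "__main__":
    # Hypothetical toy corpus; a real run would use a much larger text.
    corpus = ["the cat sat", "the cat ran", "the dog sat"]
    unigrams, bigrams = train_bigram(corpus)
    print(generate("the cat", bigrams, seed=0))
    print(perplexity("the cat sat", unigrams, bigrams))
```

Lower perplexity means the model finds a sentence less surprising, so a sentence made of bigrams seen in training should score lower than a shuffled one. Unigram and trigram variants would follow the same pattern with a context of zero or two preceding tokens.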