Repositories under the ngrams topic:
Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development
Next-token prediction in JavaScript — build fast language and diffusion models.
NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.
A fast and reliable PHP library for detecting languages
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code
Python implementation of an N-gram language model with Laplace smoothing and sentence generation.
This repository contains solutions to the assignments of the Natural Language Processing Specialization from Deeplearning.ai on Coursera, taught by Younes Bensouda Mourri, Łukasz Kaiser, and Eddy Shyu.
A flexible and general-purpose ngrams library written in Ruby. Raingrams supports ngram sizes greater than 1, text/non-text grams, multiple parsing styles and open/closed vocabulary models.
Typing Assistant autocompletes words and suggests predictions for the next word. This makes typing faster and more intelligent, and reduces effort.
Next Word Prediction using n-gram Probabilistic Model with various Smoothing Techniques
A deep learning project using fine-tuned RoBERTa to classify mental health sentiments from text, aiming to provide early insights and support. ⚕️❤️
Detecting Malware in PE files
Rust library providing fast language model queries in compressed space
NGRAMS is a search engine for the Google Books Ngram Dataset. This repository contains documentation, discussions, announcements, and issues.
Slides, exercises, and exams for my course "Natural Language Processing" (École Pour l'Informatique et les Techniques Avancées, 2024 and 2025)
A C++ library implementing fast language model estimation using the 1-Sort algorithm.
Word generation based on n-gram models, and a CLI utility to generate said models.
This project is an auto-filling text program implemented in Python using N-gram models. The program suggests the next word based on the input given by the user. It utilizes N-gram models, specifically Trigrams and Bigrams, to generate predictions.
Model Generator for Firestore
An R-based guide to sampling Google n-gram data, building historical term-feature matrices & investigating lexical semantic change historically.
Determining the similarity of alphanumeric text based on trigram matching.
A library for creating n-grams, skip-grams, bag of words, bag of n-grams, bag of skip-grams.
text mining, regex, N-grams, fuzzy matching
Pipeline for training Language Models using PyTorch.
BLEU Score in Rust
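
Several of the projects above describe next-word prediction with an n-gram model and Laplace smoothing. As a rough illustration of that general technique (not the code of any listed repository), here is a minimal Python sketch. The names `ngrams`, `BigramModel`, `prob` and `predict_next`, the toy corpus, and the whitespace tokenization are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class BigramModel:
    """Hypothetical bigram model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.bigram_counts = defaultdict(Counter)
        self.vocab = set()
        for sentence in sentences:
            # Naive whitespace tokenization with sentence boundary markers.
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.vocab.update(tokens)
            for prev, word in ngrams(tokens, 2):
                self.bigram_counts[prev][word] += 1

    def prob(self, prev, word):
        """P(word | prev) = (count(prev, word) + 1) / (count(prev, *) + |V|)."""
        followers = self.bigram_counts.get(prev, Counter())
        context_total = sum(followers.values())
        return (followers[word] + 1) / (context_total + len(self.vocab))

    def predict_next(self, prev, k=3):
        """Return the k most probable next words; ties are broken alphabetically."""
        candidates = self.vocab - {"<s>"}
        ranked = sorted(candidates, key=lambda w: (-self.prob(prev, w), w))
        return ranked[:k]


# Toy corpus, invented for this example.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = BigramModel(corpus)
print(model.predict_next("the", k=1))  # ['cat']
```

Smoothing gives every vocabulary word a non-zero probability after any context, so unseen bigrams can still be ranked. A real system would typically use longer contexts (trigrams with back-off, for instance) and a proper tokenizer.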