There is 1 repository under the ngrams topic.
Covid-19 Twitter dataset for non-commercial research use and pre-processing scripts - under active development
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
Next-token prediction in JavaScript — build fast language and diffusion models.
NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more.
A fast and reliable PHP library for detecting languages
Python implementation of an N-gram language model with Laplace smoothing and sentence generation.
A flexible and general-purpose ngrams library written in Ruby. Raingrams supports ngram sizes greater than 1, text/non-text grams, multiple parsing styles and open/closed vocabulary models.
This repository contains solutions to the assignments of the Natural Language Processing Specialization from DeepLearning.AI on Coursera, taught by Younes Bensouda Mourri, Łukasz Kaiser, and Eddy Shyu.
Typing Assistant autocompletes words and suggests predictions for the next word. This makes typing faster and more intelligent, and reduces effort.
Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code
Next Word Prediction using n-gram Probabilistic Model with various Smoothing Techniques
Rust library providing fast language model queries in compressed space
A C++ library implementing fast language model estimation using the 1-Sort algorithm.
This project is a text auto-completion program implemented in Python using N-gram models. The program suggests the next word based on the input given by the user, using bigram and trigram models to generate predictions.
Detecting Malware in PE files
An R-based guide to sampling Google n-gram data, building historical term-feature matrices & investigating lexical semantic change historically.
Word generation based on n-gram models, and a cli utility to generate said models.
A library for creating n-grams, skip-grams, bag of words, bag of n-grams, bag of skip-grams.
Determining the similarity of alphanumeric text based on trigram matching.
text mining, regex, N-grams, fuzzy matching
Pipeline for training Language Models using PyTorch.
Model Generator for Firestore
NGRAMS is a search engine for the Google Books Ngram Dataset. This repository contains documentation, discussions, announcements, and issues.
Jupyter Notebook for Natural Language Processing learning
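Several of the descriptions above mention n-gram extraction and Laplace (add-one) smoothing. As a rough illustration of those two ideas only, here is a minimal Python sketch. The function names `ngrams` and `laplace_bigram_prob` are hypothetical and are not taken from any repository listed here, and whitespace tokenization is assumed for simplicity.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def laplace_bigram_prob(tokens, prev, word):
    """Add-one (Laplace) estimate of P(word | prev) from a token sequence.

    Assumption: the vocabulary is the set of distinct tokens seen.
    """
    vocab_size = len(set(tokens))
    bigrams = ngrams(tokens, 2)
    bigram_counts = Counter(bigrams)
    # Count how often each token appears as the first element of a bigram.
    context_counts = Counter(first for first, _ in bigrams)
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)


tokens = "the cat sat on the mat".split()
print(ngrams(tokens, 3))
print(laplace_bigram_prob(tokens, "the", "cat"))  # (1 + 1) / (2 + 5)
```

Adding one to every count keeps unseen bigrams from getting zero probability, at the cost of shifting some probability mass away from observed ones.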