This repo contains an R Jupyter notebook that develops an n-gram model for word prediction, using the data provided by the capstone course of the Johns Hopkins University Data Science Specialization on Coursera.
The Shiny app can be found here.
trainsubsetting.R reduces the amount of data used to build the n-gram model. In this case, the reduced data set is about 1% of the original data set.
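As a rough illustration of this kind of reduction, the sketch below keeps a random 1% of the lines of a text corpus. It is not taken from trainsubsetting.R: the helper name `sample_lines`, the `fraction` and `seed` parameters, and the use of simple random sampling are all assumptions made for the example.

```r
# Hypothetical helper (not from the repo): keep a random fraction of the
# lines in a character vector, preserving their original order.
sample_lines <- function(lines, fraction = 0.01, seed = 42) {
  if (length(lines) == 0) return(character(0))
  set.seed(seed)                                   # reproducible subset
  n_keep <- max(1, round(length(lines) * fraction))
  keep <- sort(sample.int(length(lines), n_keep))  # sorted indices keep order
  lines[keep]
}

# Toy corpus standing in for the course data.
corpus <- sprintf("line %d", 1:1000)
reduced <- sample_lines(corpus, fraction = 0.01)
```

With real data, `corpus` would typically come from `readLines()` on one of the course text files, and `reduced` could be written back out with `writeLines()`.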
ngramgenerator.R generates n-grams from the reduced data set.
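To make the idea of n-gram generation concrete, here is a minimal base R sketch that tokenizes text and counts n-grams. It is not the code in ngramgenerator.R, which may well use a dedicated text-mining package; the function names `make_ngrams` and `count_ngrams` and the simple lowercase, letters-only tokenization are assumptions for illustration.

```r
# Hypothetical helper (not from the repo): split one string into lowercase
# word tokens and return every run of n consecutive tokens.
make_ngrams <- function(text, n = 2) {
  tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
  tokens <- tokens[nzchar(tokens)]                 # drop empty tokens
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Hypothetical helper: n-gram frequencies over many lines, most frequent first.
count_ngrams <- function(lines, n = 2) {
  grams <- as.character(unlist(lapply(lines, make_ngrams, n = n)))
  counts <- table(grams)
  counts[order(counts, decreasing = TRUE)]
}

bigrams <- count_ngrams(c("The cat sat", "The cat ran"), n = 2)
```

N-grams are built per line here, so no n-gram spans a line boundary. A word predictor can then look up the most frequent n-gram that starts with the words typed so far and suggest its last word.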