Olga Gureenkova's repositories
HashingTfidfVectorizer
A very fast implementation of a tf-idf vectorizer
wikiextractor
A tool for extracting plain text from Wikipedia dumps
CoreferenceData
This corpus of 2684 texts can be used to train coreference resolution algorithms for the Russian language. All texts were downloaded from the mvd.ru website.
datasharing
The Leek group guide to data sharing
DeepPavlov
An open source library for building end-to-end dialog systems and training chatbots.
doccano-client
A simple client wrapper for doccano API.
google-research
Google AI Research
ProgrammingAssignment2
Repository for Programming Assignment 2 for R Programming on Coursera
roc_curve_test
A simple roc_curve implementation tested on different types of classifiers (discrete, probabilistic and constant)
SpacyTokenizerTest
An implementation of a spaCy tokenizer class.