pingmehard's repositories
data-science-interviews
Data science interview questions and answers
tinkoff-api
tinkoff-api
cian
Parser for the cian.ru site, plus a classification model that sorts the site's photos for posting to a Telegram channel.
newsbot
News Bot lets you choose and personalize news in a single feed.
stratify
Code that lets you stratify a DataFrame in several convenient ways. The README shows how.
bpe-dropout
Class that creates BPE tokens with or without dropout. Wraps the SentencePiece library in a simple fit/predict interface.
word2vec
Word2Vec is an algorithm for representing words as embeddings. This repo contains word2vec code only.
tfidf
TF-IDF is a method that builds a token matrix for a list of texts. It creates an n × m matrix, where n is the number of texts and m is the number of unique words (tokens) in the texts' vocabulary. IMPORTANT: This .py is my own variation of TF-IDF and doesn't duplicate existing TF-IDF versions. So it could
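To make the n × m shape concrete, here is a minimal sketch of the textbook TF-IDF computation in plain Python. It is not the repo's own variation: the function name `tfidf_matrix`, the lowercased whitespace tokenization, and the unsmoothed `log(n / df)` weighting are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_matrix(texts):
    """Return (vocabulary, matrix); matrix has one row per text, one column per token."""
    # Assumption: lowercase + whitespace split stands in for a real tokenizer.
    docs = [text.lower().split() for text in texts]
    vocabulary = sorted({token for doc in docs for token in doc})
    n = len(docs)
    # Document frequency: how many texts contain each token at least once.
    df = Counter(token for doc in docs for token in set(doc))
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        row = []
        for token in vocabulary:
            # Term frequency: share of this text's tokens that are `token`.
            tf = counts[token] / len(doc) if doc else 0.0
            # Textbook inverse document frequency, no smoothing (assumption).
            idf = math.log(n / df[token])
            row.append(tf * idf)
        matrix.append(row)
    return vocabulary, matrix


texts = ["the cat sat", "the dog sat", "the bird flew"]
vocabulary, matrix = tfidf_matrix(texts)
print(len(matrix), len(matrix[0]))  # 3 6
```

Here n = 3 texts and m = 6 unique tokens. A token that occurs in every text (such as "the") gets weight 0 under this unsmoothed formula, which is one reason practical variants often add smoothing.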