Илья Козиев's repositories
NLP_Datasets
My NLP datasets for the Russian language
GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
rupostagger
Part-of-speech tagger for the Russian language
LM-finetune
Code for fine-tuning LMs (rugpt, LLaMA, FRED T5) using transformers + deepspeed + LoRA
rutokenizer
Russian text segmenter and tokenizer
StressModel
Neural model for predicting stress position in Russian words
paraphraser
Poetic paraphraser
vector2text
Generate Russian text with a GPT model, given a LaBSE text embedding vector
LM-pretrain
Char-level language model pretraining code and scripts
ruword2tags
Morphological word analyzer for the Russian language
transcriber
Model to convert text to phonetic transcription and vice versa
rupostagger2
A simple neural network model for part-of-speech tagging
word_embedders
Character-level autoencoder models for words
character-tokenizer
A character tokenizer for HuggingFace Transformers
paraphrase_reranker
Paraphrase detection and reranking model
sent_embedders
Experiments with sentence embedding models
kmeans_pytorch
K-means clustering using PyTorch
masked_np_language_model
Experiments with a generative language model (ruGPT) for restoring noun phrases
RuLeanALBERT
RuLeanALBERT is a pretrained masked language model for the Russian language that uses a memory-efficient architecture.
rulm
Language modeling for Russian