Илья Козиев (Koziev)


Location: Russia

Home Page: https://kelijah.livejournal.com/


Илья Козиев's repositories

NLP_Datasets

My NLP datasets for the Russian language

Language: C#, License: CC0-1.0, Stargazers: 341, Issues: 24, Issues: 3

chatbot

Russian-language generative chatbot with a profile and facts

GrammarEngine

Grammatical Dictionary of the Russian Language (plus English, Japanese, etc.)

Language: C++, License: MIT, Stargazers: 73, Issues: 9, Issues: 18

rulemma

Lemmatizer for Russian-language texts

Language: Python, License: MIT, Stargazers: 40, Issues: 3, Issues: 4

verslibre

Using transformers to generate Russian poetry

Language: Python, License: Unlicense, Stargazers: 31, Issues: 3, Issues: 0

rusyllab

Simple Python package for breaking Russian words into syllables

Language: Python, License: GPL-3.0, Stargazers: 27, Issues: 2, Issues: 3

rupostagger

Part-of-speech tagger for the Russian language

Language: Python, License: GPL-3.0, Stargazers: 19, Issues: 3, Issues: 3

pushkin

Generative text models

Language: Python, Stargazers: 14, Issues: 3, Issues: 0

LM-finetune

Code for fine-tuning LMs (rugpt, LLaMa, FRED T5) using transformers + deepspeed + LoRa

Language: Python, License: CC0-1.0, Stargazers: 12, Issues: 2, Issues: 0

rutokenizer

Russian text segmenter and tokenizer

Language: Python, License: GPL-3.0, Stargazers: 12, Issues: 3, Issues: 0

StressModel

Neural model for prediction of stress position in Russian words

Language: Python, License: MIT, Stargazers: 12, Issues: 3, Issues: 1

paraphraser

Poetic paraphraser

Language: Python, License: GPL-3.0, Stargazers: 8, Issues: 1, Issues: 0

vector2text

Generate Russian text using GPT model given LaBSE text embedding vector

Language: Python, License: GPL-3.0, Stargazers: 4, Issues: 1, Issues: 0

LM-pretrain

Char-level language model pretraining code and scripts

Language: Python, License: Apache-2.0, Stargazers: 3, Issues: 2, Issues: 0

ruword2tags

Morphological word analyzer for the Russian language

Language: Python, License: GPL-3.0, Stargazers: 3, Issues: 1, Issues: 1

transcriber

Model to convert text to phonetic transcription and vice versa

Language: Python, License: MIT, Stargazers: 3, Issues: 2, Issues: 0

Language: Python, License: Apache-2.0, Stargazers: 2, Issues: 0, Issues: 0

rupostagger2

A simple neural network model for part-of-speech tagging

Language: Python, Stargazers: 2, Issues: 2, Issues: 0

word_embedders

Character-level autoencoder models for words

Language: Python, License: Unlicense, Stargazers: 2, Issues: 2, Issues: 0

character-tokenizer

A character tokenizer for HuggingFace Transformers

Language: Python, License: MIT, Stargazers: 1, Issues: 0, Issues: 0

paraphrase_reranker

Paraphrase detection and reranking model

Language: Python, License: MIT, Stargazers: 1, Issues: 2, Issues: 0

ruchunker

NP chunker for the Russian language

sent_embedders

Experiments with sentence embedding models

Language: Python, License: Unlicense, Stargazers: 1, Issues: 2, Issues: 0

AGRR-2019

Model code for the AGRR-2019 task

Language: Python, License: GPL-3.0, Stargazers: 0, Issues: 2, Issues: 0

kmeans_pytorch

kmeans using PyTorch

Language: Jupyter Notebook, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

masked_np_language_model

Experiments with a generative language model (ruGPT) for restoring noun phrases

Language: Python, License: CC0-1.0, Stargazers: 0, Issues: 1, Issues: 0

math

Conversational data generator

Language: Python, License: MIT, Stargazers: 0, Issues: 1, Issues: 0

Language: Jupyter Notebook, Stargazers: 0, Issues: 2, Issues: 0

RuLeanALBERT

RuLeanALBERT is a pretrained masked language model for the Russian language that uses a memory-efficient architecture.

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

rulm

Language modeling for Russian

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0