Andrei Kalmykov's starred repositories

awesome-nlp

:book: A curated list of resources dedicated to Natural Language Processing (NLP)

ml-engineering

Machine Learning Engineering Open Book

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 11333 | Issues: 112 | Issues: 24

scientific-visualization-book

An open access book on scientific visualization using python and matplotlib

Language: Python | License: NOASSERTION | Stargazers: 10678 | Issues: 189 | Issues: 43

ffsubsync

Automagically synchronize subtitles with video.

Language: Python | License: MIT | Stargazers: 6749 | Issues: 76 | Issues: 150

tensorflow-handbook

A Concise Handbook of TensorFlow 2 (简单粗暴 TensorFlow 2) | A concise introductory tutorial for TensorFlow 2

Language: Jupyter Notebook | Stargazers: 3941 | Issues: 138 | Issues: 26

projector-docker

Run JetBrains IDEs remotely with Docker

Language: Shell | License: Apache-2.0 | Stargazers: 2235 | Issues: 64 | Issues: 0

contextualized-topic-models

A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).

Language: Python | License: MIT | Stargazers: 1196 | Issues: 17 | Issues: 109

fastprogress

Simple and flexible progress bar for Jupyter Notebook and console

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1084 | Issues: 23 | Issues: 76

wikipedia2vec

A tool for learning vector representations of words and entities from Wikipedia

Language: Python | License: NOASSERTION | Stargazers: 935 | Issues: 35 | Issues: 68

GENRE

Autoregressive Entity Retrieval

Language: Python | License: NOASSERTION | Stargazers: 760 | Issues: 19 | Issues: 96

chat_templates

Chat Templates for 🤗 HuggingFace Large Language Models

Language: Jinja | License: MIT | Stargazers: 482 | Issues: 7 | Issues: 14

simalign

Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)

Language: Python | License: MIT | Stargazers: 347 | Issues: 10 | Issues: 33

fuzzyset

A simple fuzzy matching set for python strings

compling_nlp_hse_course

Course materials on computational linguistics from the School of Linguistics, HSE University

Language: Jupyter Notebook | Stargazers: 176 | Issues: 8 | Issues: 1

colibri-core

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

Language: C++ | License: GPL-3.0 | Stargazers: 123 | Issues: 11 | Issues: 37

pyphrasy

Inflection of Russian collocations based on pymorphy2

text-classification-baseline

Pipeline for quickly building TF-IDF + LogReg text classification baselines.

Language: Python | License: MIT | Stargazers: 63 | Issues: 2 | Issues: 50

annotated-transformer

http://nlp.seas.harvard.edu/2018/04/03/attention.html

Language: Jupyter Notebook | License: MIT | Stargazers: 62 | Issues: 1 | Issues: 0

GEM-metrics

Automatic metrics for GEM tasks

Language: Python | License: MIT | Stargazers: 61 | Issues: 3 | Issues: 61

rulm-sbs2

A benchmark comparing Russian counterparts of ChatGPT: Saiga, YandexGPT, Gigachat

Language: Jupyter Notebook | Stargazers: 55 | Issues: 3 | Issues: 1

Language: C# | License: MIT | Stargazers: 54 | Issues: 3 | Issues: 3

matstat_lec

Lectures on mathematical statistics, in Russian

Language: TeX | License: MIT | Stargazers: 35 | Issues: 4 | Issues: 1

RuSentEval

Probing suite for evaluation of Russian embedding and language models

Language: Python | License: Apache-2.0 | Stargazers: 32 | Issues: 5 | Issues: 2

deep_learning_tf

Deep learning with TensorFlow (in Russian)

Language: Jupyter Notebook | License: MIT | Stargazers: 31 | Issues: 4 | Issues: 0

rugpt3-question-generation

Generate questions based on text in Russian

Language: Jupyter Notebook | Stargazers: 27 | Issues: 2 | Issues: 0

mmlu_ru

MMLU eval for RU/EN

Language: Python | Stargazers: 14 | Issues: 0 | Issues: 1

neural_nets_prob

Understanding how neural networks work through hand-worked problems :)

Language: TeX | Stargazers: 12 | Issues: 4 | Issues: 0

neural_nets_dpo

Deep learning course for HSE continuing education program

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 4 | Issues: 2 | Issues: 0