Aleksei Dorkin's repositories
ancient-lang-adapters
Code for submissions to SIGTYP 2024, EvaLatin 2024, and AXOLOTL 2024 shared tasks
GNNs-Recipe
A recipe to study Graph Neural Networks (GNNs)
QualiaAnnotationUI
A prototype UI for annotation of qualia relations between FrameNet Lexical Units inferred from an external knowledge base
adapters
A Unified Library for Parameter-Efficient and Modular Transfer Learning
awesome-huggingface
🤗 A list of wonderful open-source projects & applications integrated with Hugging Face libraries.
BabelNetExtractor
A Scala tool to extract data from local BabelNet indices
biomedical
Tools for curating biomedical training data for large-scale language modeling
c2xg
A Python package for learning, evaluating, annotating, and extracting vector representations of construction grammars
CondViT-LRVSF
Official Implementation of Conditional ViT on LAION - Referred Visual Search - Fashion
course
The Hugging Face course
datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble
DeepPavlov
An open source library for deep learning end-to-end dialog systems and chatbots.
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
FnSenseMapper
A tool to map FrameNet Lexical Units to BabelNet synsets using the distance between sentence embeddings of corresponding definitions
lexicon-enhanced-lemmatization
Neural encoder-decoder model for lemmatization
min-dalle
min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch
moondream
A tiny vision language model
MorphyNet
MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation)
neural-transducer
This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks.
rq-vae-transformer
The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)
SetSimilaritySearch
All-pair set similarity search on millions of sets in Python and on a laptop
stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages