Craig Schmidt's starred repositories
sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
tokenizers
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Riskfolio-Lib
Portfolio Optimization and Quantitative Strategic Asset Allocation in Python
AlphaZero.jl
A generic, simple and fast implementation of DeepMind's AlphaZero algorithm.
SBERT-WK-Sentence-Embedding
IEEE/ACM TASLP 2020: SBERT-WK: A Sentence Embedding Method By Dissecting BERT-based Word Models
sequence_align
Efficient implementations of Needleman-Wunsch and other sequence alignment algorithms written in Rust with Python bindings via PyO3.
aiohttp-scraper
A robust asynchronous web scraping client using aiohttp.
fold_to_ascii
A Python port of the Apache Lucene ASCII Folding Filter that converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the ‘Basic Latin’ Unicode block) into ASCII equivalents, if they exist.
public-woo-api
WordPress plugin for working with the WooCommerce REST API
Stochastic_Dominance
Functions for portfolio optimization under second order stochastic dominance constraints
tree-isomorphism-test
An isomorphism test for trees, using NetworkX's data structures (not the algorithm!).
lukes-hugo-theme
My personal Hugo theme.