Michael Günther's repositories
postgres-word2vec
Utilities for using word embedding models such as word2vec in a PostgreSQL database
table-embeddings
Tools for training schema-aware Web table embeddings for unsupervised and supervised machine learning on tabular data
the-movie-database-import
Script to import data from The Movie Database to PostgreSQL (Dataset URL: https://www.kaggle.com/rounakbanik/the-movies-dataset)
postgres-retrofit
Tools to create database-specific text value embeddings from word embedding datasets
google-play-dataset-import
Script to import data from a Google Play Store Apps dataset to a PostgreSQL database (Dataset URL: https://www.kaggle.com/lava18/google-play-store-apps)
open-food-facts-postgresql-import
Script to import data from the Open Food Facts dataset to PostgreSQL (Dataset URL: https://www.kaggle.com/openfoodfacts/world-food-facts)
mteb
MTEB: Massive Text Embedding Benchmark
NLP-OSS
Democratizing NLP!
SimilarityMeasure
Compute the most similar node for a given node in a graph
test-gradient-cache
Small test script that applies gradient cache (https://github.com/luyug/GradCache) to train a model for a retrieval task on the SciFact dataset (https://allenai.org/data/scifact)