Rui Meng's repositories
OpenNMT-kpg-release
Keyphrase Generation
academic-budget-bert
Repository containing code for the "How to Train BERT with an Academic Budget" paper
ANCE
A novel embedding training algorithm that leverages ANN search and achieves SOTA retrieval on the TREC DL 2019 and OpenQA benchmarks
apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21)
contriever
Contriever: Towards Unsupervised Dense Information Retrieval with Contrastive Learning
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
DHR
This is the repository for Dense Hierarchical Retrieval for Open-Domain Question Answering
emdr2
Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 2021)
fairscale
PyTorch extensions for high performance and large scale training.
pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
pytorchviz
A small package to create visualizations of PyTorch execution graphs
sentence-transformers
Multilingual Sentence & Image Embeddings with BERT
Sentence-VAE
PyTorch Implementation of "Generating Sentences from a Continuous Space" by Bowman et al., 2015 https://arxiv.org/abs/1511.06349
SentEval
A Python tool for evaluating the quality of sentence embeddings.
SimCSE
EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings
wikiextractor
A tool for extracting plain text from Wikipedia dumps
yttm_transformers_tokenizer
Implementation of youtokentome tokenizer for transformers