Xueguang Ma 马雪光's repositories
bertserini
BERTserini
ANCE
A novel embedding training algorithm that leverages ANN search and achieves SOTA retrieval on TREC DL 2019 and OpenQA benchmarks
anserini
A Lucene toolkit for replicable information retrieval research
bootstrap
The most popular HTML, CSS, and JavaScript framework for developing responsive, mobile-first projects on the web.
COIL
NAACL2021 - COIL Contextualized Lexical Retriever
Dense
A toolkit for building dense retrievers with deep language models.
FARM
🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
mesh
Mesh TensorFlow: Model Parallelism Made Easier
MSMARCO-Document-Ranking-Submissions
Submission archive for the MS MARCO document ranking leaderboard
MSMARCO-Passage-Ranking-Submissions
Submission archive for the MS MARCO passage ranking leaderboard
MXueguang.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
natural-questions
Natural Questions (NQ) contains real user questions issued to Google search, and answers found from Wikipedia by annotators. NQ is designed for the training and evaluation of automatic question answering systems.
presidio
Context aware, pluggable and customizable data protection and PII data anonymization service for text and images
pygaggle
A gaggle of deep neural architectures for text ranking and question answering, designed for Pyserini
scifact
Data and models for the SciFact verification task.
self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
staged-recipes
A place to submit conda recipes before they become fully fledged conda-forge feedstocks
tevatron
Tevatron - A flexible toolkit for dense retrieval research and development.
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.