thakur-nandan

Nandan Thakur's repositories

sprint

SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.

Language: Python | Apache-2.0 | 40 6 8

income

INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.

Language: Python | Apache-2.0 | 22 3 1

beir-ColBERT

Evaluation of BEIR Datasets using ColBERT retrieval model

Language: Python | MIT | 1300

topic-modeling

This repository contains an intuitive example of topic modeling using regular LDA, and shows how GuidedLDA improves on regular LDA

Language: Jupyter Notebook | MIT | 7 1 1

beir-JPQ

CIKM'21: JPQ substantially improves the efficiency of Dense Retrieval with 30x compression ratio, 10x CPU speedup and 2x GPU speedup.

Language: Python | MIT | 100

Imagesearch

CS 679 Project Repository: Learning Efficient Autoencoders for Image Search

Language: Python | Apache-2.0 | 1 10

jekyll-instagram

Language: Ruby | MIT | 100

personal-website

Language: HTML | MIT | 100

poison-texts

CS 886 Project on Adversarial Attacks on NLP models

Language: Python | Apache-2.0 | 1 10

compute-canada

CC information provided to easily run Slurm scripts on CC Wiki

Apache-2.0 | 010

anserini

A Lucene toolkit for replicable information retrieval research

Language: Java | 000

BatteryDEV

Our Official Code Repository for QS-EIS-Challenge BatteryDEV 2022

Language: Python | MIT | 010

beir-leaderboard

BEIR Leaderboard

Language: HTML | Apache-2.0 | 020

citadel-repro

A reproduction of CITADEL and CITADEL+ checkpoints using the dpr-scale repository

Language: Python | NOASSERTION | 020

CQADupStack

A Benchmark Data Set for Community Question-Answering Research

Language: Python | Apache-2.0 | 000

datasets

🤗 The largest hub of ready-to-use NLP datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Language: Python | Apache-2.0 | 000

Deep-Learning

Language: Jupyter Notebook | MIT | 000

DeepCT

DeepCT and HDCT use BERT to generate novel, context-aware bag-of-words term weights for documents and queries.

Language: Python | BSD-3-Clause | 000

hf-upload

Sample scripts used for uploading bulk datasets and models to HF

Language: Shell | 010

mGTRR

Easy-to-use Multi-GPU Training of Retriever and Reranker

Apache-2.0 | 010

mteb

MTEB: Massive Text Embedding Benchmark

Language: Python | Apache-2.0 | 000

orpo

Official repository for ORPO

Language: Python | 000

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | Apache-2.0 | 000

qra_code

Question similarity with domain adaptation.

Language: Python | 000

sentence-transformers

Sentence Embeddings with BERT & XLNet

Language: Python | Apache-2.0 | 000

tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.

Language: Python | Apache-2.0 | 000

thakur-nandan.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript | MIT | 000

video-insights

Video insights created using open-source packages

Language: Jupyter Notebook | 000

words

Language: HTML | 000

words-urvashi

Language: HTML | 000