Binoy Dalal (bdalal)

Company: @cresta

Location: Raleigh, NC

Binoy Dalal's starred repositories

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 23695 | Issues: 0

llm_distillation_playbook

Best practices for distilling large language models.

Language: Jupyter Notebook | Stargazers: 378 | Issues: 0

List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

License: CC-BY-4.0 | Stargazers: 2899 | Issues: 0

cv-sentence-extractor

Scraping Wikipedia for fair use sentences

Language: Rust | Stargazers: 52 | Issues: 0

Language: Jupyter Notebook | License: MIT | Stargazers: 655 | Issues: 0

noTunes

A simple macOS application that will prevent iTunes or Apple Music from launching.

Language: Swift | License: MIT | Stargazers: 3623 | Issues: 0

navi

An interactive cheatsheet tool for the command-line

Language: Rust | License: Apache-2.0 | Stargazers: 14916 | Issues: 0

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 37738 | Issues: 0

silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Language: Python | License: MIT | Stargazers: 4089 | Issues: 0

interviews.ai

It is my belief that you, the postgraduate students and job-seekers for whom the book is primarily meant, will benefit from reading it; however, it is my hope that even the most experienced researchers will find it fascinating as well.

Stargazers: 4504 | Issues: 0

120-Data-Science-Interview-Questions

Answers to 120 commonly asked data science interview questions.

Stargazers: 3697 | Issues: 0

teaching

Open-Source Information Retrieval Courses @ TU Wien

Language: Python | License: GPL-3.0 | Stargazers: 590 | Issues: 0

docTTTTTquery

docTTTTTquery document expansion model

Language: Python | License: Apache-2.0 | Stargazers: 354 | Issues: 0

matchmaker

Training & evaluation library for text-based neural re-ranking and dense retrieval models built with PyTorch

Language: Python | License: Apache-2.0 | Stargazers: 260 | Issues: 0

CtCI-6th-Edition-Python

Cracking the Coding Interview 6th Ed. Python Solutions

Language: Python | Stargazers: 4947 | Issues: 0

AugLy

A data augmentations library for audio, image, text, and video.

Language: Python | License: NOASSERTION | Stargazers: 4949 | Issues: 0

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++ | License: Apache-2.0 | Stargazers: 10615 | Issues: 0

detext

DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks

Language: Python | License: BSD-2-Clause | Stargazers: 1263 | Issues: 0

onnx-simplifier

Simplify your ONNX model

Language: C++ | License: Apache-2.0 | Stargazers: 3794 | Issues: 0

I-BERT

[ICML'21 Oral] I-BERT: Integer-only BERT Quantization

Language: Python | License: MIT | Stargazers: 225 | Issues: 0

nlp-architect

A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks

Language: Python | License: Apache-2.0 | Stargazers: 2936 | Issues: 0

ldig

Language Detection with Infinity-gram

Language: C++ | Stargazers: 230 | Issues: 0

pretrain-gnns

Strategies for Pre-training Graph Neural Networks

Language: Python | License: MIT | Stargazers: 957 | Issues: 0

FBTT-Embedding

This is a Tensor Train based compression library for compressing the sparse embedding tables used in large-scale machine learning models, such as recommendation and natural language processing models. We showed that this library can reduce the total model size by up to 100x in Facebook's open-sourced DLRM model while achieving the same model quality. Our implementation is faster than the state-of-the-art implementations. Existing state-of-the-art libraries also decompress the whole embedding table on the fly, so they provide no memory reduction at training runtime. Our library decompresses only the requested rows and can therefore provide a 10,000x memory footprint reduction per embedding table. The library also includes a software cache that stores a portion of the table entries in decompressed format for faster lookup and processing.

Language: Cuda | License: MIT | Stargazers: 192 | Issues: 0

dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Language: Python | License: Apache-2.0 | Stargazers: 13406 | Issues: 0

langdetect

Port of Google's language-detection library to Python.

Language: Python | License: NOASSERTION | Stargazers: 1716 | Issues: 0

cs-video-courses

List of Computer Science courses with video lectures.

Stargazers: 66639 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 251 | Issues: 0

pytorch_geometric

Graph Neural Network Library for PyTorch

Language: Python | License: MIT | Stargazers: 21081 | Issues: 0

lw-k8s-workshop

Hands-on Labs for Fusion 5 Kubernetes Workshop

Language: Shell | Stargazers: 1 | Issues: 0