Jordan T Bates (jtbates)



Company: @liftoffio

Location: Washington



Organizations
dssg

Jordan T Bates's starred repositories

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python, License: Apache-2.0, Stargazers: 38407, Issues: 384, Issues: 1622

recommenders

Best Practices on Recommendation Systems

Language: Python, License: MIT, Stargazers: 18520, Issues: 273, Issues: 850

prql

PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement

Language: Rust, License: Apache-2.0, Stargazers: 9665, Issues: 45, Issues: 977

open_clip

An open source implementation of CLIP.

Language: Python, License: NOASSERTION, Stargazers: 9342, Issues: 75, Issues: 455

petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Language: Python, License: MIT, Stargazers: 8973, Issues: 90, Issues: 195

vowpal_wabbit

Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

Language: C++, License: NOASSERTION, Stargazers: 8451, Issues: 353, Issues: 1267

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++, License: Apache-2.0, Stargazers: 7690, Issues: 88, Issues: 1649

smile

Statistical Machine Intelligence & Learning Engine

Language: Java, License: NOASSERTION, Stargazers: 5984, Issues: 270, Issues: 602

SynapseML

Simple and Distributed Machine Learning

Language: Scala, License: MIT, Stargazers: 5023, Issues: 146, Issues: 716

fsrs4anki

A modern Anki custom scheduling based on Free Spaced Repetition Scheduler algorithm

Language: Jupyter Notebook, License: MIT, Stargazers: 2424, Issues: 27, Issues: 379

SentEval

A python tool for evaluating the quality of sentence embeddings.

Language: Python, License: NOASSERTION, Stargazers: 2071, Issues: 47, Issues: 58

omegaconf

Flexible Python configuration system. The last one you will ever need.

Language: Python, License: BSD-3-Clause, Stargazers: 1893, Issues: 18, Issues: 552

Cream

This is a collection of our NAS and Vision Transformer work.

Language: Python, License: MIT, Stargazers: 1626, Issues: 36, Issues: 156

MetaCLIP

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Language: Python, License: NOASSERTION, Stargazers: 1129, Issues: 13, Issues: 24

uform

Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️

Language: Python, License: Apache-2.0, Stargazers: 983, Issues: 14, Issues: 25

monolith

ByteDance's Recommendation System

Language: Python, License: NOASSERTION, Stargazers: 842, Issues: 48, Issues: 14

gnn-model-explainer

gnn explainer

Language: Python, License: Apache-2.0, Stargazers: 836, Issues: 22, Issues: 30

tensordict

TensorDict is a pytorch dedicated tensor container.

Language: Python, License: MIT, Stargazers: 680, Issues: 31, Issues: 101

robust_loss_pytorch

A pytorch port of google-research/google-research/robust_loss/

Language: Python, License: Apache-2.0, Stargazers: 651, Issues: 13, Issues: 32

tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.

Language: Python, License: Apache-2.0, Stargazers: 443, Issues: 10, Issues: 87

InfoGraph

Official code for "InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization" (ICLR 2020, spotlight)

LargeBatchCTR

Large batch training of CTR models based on DeepCTR with CowClip.

Language: Python, License: Apache-2.0, Stargazers: 161, Issues: 4, Issues: 1

space

Unified storage framework for the entire machine learning lifecycle

Language: Python, License: Apache-2.0, Stargazers: 144, Issues: 9, Issues: 3

dviz-course

Data visualization course material

Language: TeX, License: MIT, Stargazers: 135, Issues: 11, Issues: 7

fasttrackml

Experiment tracking server focused on speed and scalability

Language: Go, License: Apache-2.0, Stargazers: 96, Issues: 15, Issues: 430

CachedEmbedding

A memory efficient DLRM training solution using ColossalAI

Language: Python, License: Apache-2.0, Stargazers: 91, Issues: 6, Issues: 3

wukong-recommendation

Implements the paper "Wukong: Towards a Scaling Law for Large-Scale Recommendation" from Meta.

Language: Python, License: MIT, Stargazers: 30, Issues: 1, Issues: 1

autoshard

[KDD 2022] AutoShard: Automated Embedding Table Sharding for Recommender Systems

Language: Python, License: MIT, Stargazers: 21, Issues: 5, Issues: 2

CEDS-Data-Warehouse-Parquet

The Common Education Data Standards (CEDS) Data Warehouse Parquet (DW Parquet) standard is designed for data engineering and data science needs in the cloud. The DW Parquet Models mirror the SQL-based CEDS Data Warehouse. Parquet files are designed for rapid and distributed reporting across multiple technology stacks, data processing and BI tools, and are cloud vendor agnostic. This standard is ideal for stakeholders implementing reporting structures in a data lake environment.

Language: Python, License: Apache-2.0, Stargazers: 10, Issues: 7, Issues: 0

SeqTestBlog

Simulation files for Schultzberg & Ankargren blogpost 2023

Language: R, Stargazers: 6, Issues: 0, Issues: 0