dblalock

Company: MosaicML

Location: San Francisco, CA

Home Page: https://dblalock.substack.com

dblalock's starred repositories

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python | License: Apache-2.0 | Stargazers: 33228 | Issues: 335 | Issues: 2575

applied-ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

modern-cpp-features

A cheatsheet of modern C++ language and library features.

hammerspoon

Staggeringly powerful macOS desktop automation with Lua

Language: Objective-C | License: MIT | Stargazers: 11592 | Issues: 118 | Issues: 2458

scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

Language: Python | License: Apache-2.0 | Stargazers: 11275 | Issues: 90 | Issues: 455

ml-engineering

Machine Learning Engineering Open Book

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 10038 | Issues: 103 | Issues: 18

pretrained-models.pytorch

Pretrained ConvNets for PyTorch: NASNet, ResNeXt, ResNet, InceptionV4, InceptionResnetV2, Xception, DPN, etc.

Language: Python | License: BSD-3-Clause | Stargazers: 8967 | Issues: 217 | Issues: 180

einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Language: Python | License: MIT | Stargazers: 8023 | Issues: 68 | Issues: 171

composer

Supercharge Your Model Training

Language: Python | License: Apache-2.0 | Stargazers: 5042 | Issues: 52 | Issues: 524

DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

Language: C++ | License: Apache-2.0 | Stargazers: 4949 | Issues: 94 | Issues: 1557

readyset

Readyset is a MySQL and Postgres wire-compatible caching layer that sits in front of existing databases to speed up queries and horizontally scale read throughput. Under the hood, Readyset caches the results of SELECT statements and incrementally updates these results as the underlying data changes.

Language: Rust | License: NOASSERTION | Stargazers: 3958 | Issues: 21 | Issues: 478

llm-foundry

LLM training code for Databricks foundation models

Language: Python | License: Apache-2.0 | Stargazers: 3776 | Issues: 46 | Issues: 358

annotated_latex_equations

Examples of how to create colorful, annotated equations in LaTeX using TikZ.

Language: TeX | License: MIT | Stargazers: 3712 | Issues: 37 | Issues: 3

differential-privacy

Google's differential privacy libraries.

Language: Go | License: Apache-2.0 | Stargazers: 2992 | Issues: 117 | Issues: 75

hlb-CIFAR10

Train to 94% on CIFAR-10 in <6.3 seconds on a single A100. Or ~95.79% in ~110 seconds (or less!)

Language: Python | License: Apache-2.0 | Stargazers: 1190 | Issues: 20 | Issues: 3

Neighborhood-Attention-Transformer

Neighborhood Attention Transformer, arxiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arxiv 2022

Language: Python | License: MIT | Stargazers: 1005 | Issues: 16 | Issues: 74

streaming

A Data Streaming Library for Efficient Neural Network Training

Language: Python | License: Apache-2.0 | Stargazers: 975 | Issues: 19 | Issues: 133

imagenette

A smaller subset of 10 easily classified classes from ImageNet, and a little more French

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 904 | Issues: 12 | Issues: 24

bagua

Bagua Speeds up PyTorch

Language: Python | License: MIT | Stargazers: 868 | Issues: 16 | Issues: 145

ImageNet21K

Official PyTorch implementation of the paper "ImageNet-21K Pretraining for the Masses" (NeurIPS 2021)

Language: Python | License: MIT | Stargazers: 702 | Issues: 11 | Issues: 71

uarch-bench

A benchmark for low-level CPU micro-architectural features

Language: C++ | License: MIT | Stargazers: 666 | Issues: 34 | Issues: 85

UniverSeg

UniverSeg: Universal Medical Image Segmentation

Language: Python | License: Apache-2.0 | Stargazers: 472 | Issues: 9 | Issues: 28

examples

Fast and flexible reference benchmarks

Language: Shell | License: Apache-2.0 | Stargazers: 418 | Issues: 16 | Issues: 37

shrinkbench

PyTorch library to facilitate development and standardized evaluation of neural network pruning methods.

Language: Python | License: MIT | Stargazers: 412 | Issues: 17 | Issues: 22

papers-with-video

A browser extension that links video explanations to research papers on arxiv.org

Language: JavaScript | License: MIT | Stargazers: 411 | Issues: 14 | Issues: 3

SGEMM_CUDA

Fast CUDA matrix multiplication from scratch

Language: Cuda | License: MIT | Stargazers: 299 | Issues: 2 | Issues: 4

halutmatmul

Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator

Language: Python | License: MIT | Stargazers: 204 | Issues: 10 | Issues: 3

custom_matmul_kernels

Customized matrix multiplication kernels

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 49 | Issues: 1 | Issues: 4

hfta

Boost hardware utilization for ML training workloads via Inter-model Horizontal Fusion

Language: Python | License: MIT | Stargazers: 32 | Issues: 6 | Issues: 19

Language: Python | License: Apache-2.0 | Stargazers: 17 | Issues: 3 | Issues: 0