Mahmoud Zamani's repositories
spark-rapids
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
aws-neuron-sdk
Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
cutlass
CUDA Templates for Linear Algebra Subroutines
dbrx
Code examples and resources for DBRX, a large language model developed by Databricks
Gaudi-tutorials
Tutorials for running models on First-gen Gaudi and Gaudi2 for Training and Inference. The source files for the tutorials on https://developer.habana.ai/
loghub
A large collection of system log datasets for AI-driven log analytics [ISSRE'23]
Model-References
TensorFlow and PyTorch Reference models for Gaudi(R)
qdrant
Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
gorilla
Gorilla: An API store for LLMs
garnet
Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication features. Garnet can work with existing Redis clients.
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
glow
Compiler for Neural Network hardware accelerators
serve
Serve, optimize and scale PyTorch models in production
kineto
A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.
MultiCoT
Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph
bringup-bench
Bringup-Bench is a collection of standalone benchmarks with minimal library and system dependencies, useful for bringing up newly designed CPUs, accelerators, compilers, and operating systems. You probably don't need Bringup-Bench, but if you do, you probably need it badly!
ibex
Ibex is a small 32-bit RISC-V CPU core, previously known as zero-riscy.
AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
googlesearch
googlesearchfile
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
nextpnr
nextpnr portable FPGA place and route tool
ompi
Open MPI main development repository
langchain
⚡ Building applications with LLMs through composability ⚡
rohd-hcl
A hardware component library developed with ROHD.
modelmesh-serving
Controller for ModelMesh
ghdl
VHDL 2008/93/87 simulator
conda-eda
Conda recipes for FPGA EDA tools for simulation, synthesis, place and route and bitstream generation.