menelaus (Young768)

Young768

Geek Repo

Company:Nvidia

Location:Santa Clara

Github PK Tool:Github PK Tool

menelaus's repositories

bigbird

Transformers for Longer Sequences

License:Apache-2.0Stargazers:0Issues:0Issues:0

CUDALibrarySamples

CUDA Library Samples

Language:C++License:NOASSERTIONStargazers:0Issues:1Issues:0

DeepLearningExamples

Deep Learning Examples

Language:PythonStargazers:0Issues:0Issues:0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

License:MITStargazers:0Issues:0Issues:0
Language:C++Stargazers:0Issues:0Issues:0
Language:HTMLStargazers:0Issues:0Issues:0

google-research

Google Research

License:Apache-2.0Stargazers:0Issues:0Issues:0

iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Language:C++License:Apache-2.0Stargazers:0Issues:0Issues:0
Language:MLIRStargazers:0Issues:0Issues:0

jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:0Issues:0

Megatron-LM

Ongoing research training transformer models at scale

Language:PythonLicense:NOASSERTIONStargazers:0Issues:0Issues:0
Stargazers:0Issues:0Issues:0

openshmem-examples

Some miscellaneous OpenSHMEM examples

Language:CStargazers:0Issues:1Issues:0

paxml

Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry leading model flop utilization rates.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:0Issues:0

profiling

some exp logs.

Language:PythonStargazers:0Issues:2Issues:0
Language:PythonStargazers:0Issues:0Issues:0

SHARK-dev

SHARK - High Performance Machine Learning for CPUs, GPUs, Accelerators and Heterogeneous Clusters

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

tensorflow

An Open Source Machine Learning Framework for Everyone

Language:C++License:Apache-2.0Stargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:0Issues:0

training

Reference implementations of MLPerf™ training benchmarks

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.

Language:PythonLicense:Apache-2.0Stargazers:0Issues:0Issues:0

triton

Development repository for the Triton language and compiler

Language:C++License:MITStargazers:0Issues:0Issues:0

TurboTransformers

a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU.

Language:C++License:NOASSERTIONStargazers:0Issues:0Issues:0

tutorials

PyTorch tutorials.

Language:PythonLicense:BSD-3-ClauseStargazers:0Issues:1Issues:0

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Language:PythonLicense:MITStargazers:0Issues:1Issues:0

xla

A machine learning compiler for GPUs, CPUs, and ML accelerators

Language:C++License:Apache-2.0Stargazers:0Issues:0Issues:0