Beast code in Giters

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Language: Python | Apache-2.0 | 11558 202 2235

composer

Supercharge Your Model Training

Language: Python | Apache-2.0 | 5120 49 541

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python | MIT | 4448 49 289

alpa

Training and serving large-scale neural networks with auto parallelization.

Language: Python | Apache-2.0 | 3047 45 296

t5x

Language: Python | Apache-2.0 | 2640 36 140

RL4LMs

A modular RL library to fine-tune language models to human preferences

Language: Python | Apache-2.0 | 2175 25 56

NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

Language: Python | MIT | 1690 79 151

smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python | MIT | 1183 21 87

torchxrayvision

TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.

Language: Jupyter Notebook | Apache-2.0 | 894 18 76

truss

The simplest way to serve AI/ML models in production

Language: Python | MIT | 882 15 121

enformer-pytorch

Implementation of Enformer, DeepMind's attention network for predicting gene expression, in PyTorch

Language: Python | MIT | 415 17 36

OMOP2OBO

OMOP2OBO: A Python Library for mapping OMOP standardized clinical terminologies to Open Biomedical Ontologies

Language: Jupyter Notebook | MIT | 83 9 46

primock57

Dataset of 57 mock medical primary care consultations: audio, consultation notes, human utterance-level transcripts.

Language: Python | NOASSERTION | 34 300

LiST

Lite Self-Training

Language: Python | MIT | 29 6 3

ETAB

[NeurIPS 2022] Official Codebase for "ETAB: A Benchmark Suite for Visual Representation Learning in Echocardiography"

Language: Jupyter Notebook | 25 3 2

ehr_ml

Code for doing machine learning with various EHRs

Language: C++ | MIT | 21 8 22

cotrain-prompting

Code for co-training large language models (e.g. T0) with smaller ones (e.g. BERT) to boost few-shot performance

Language: Python | MIT | 17 180

dataset

Multi-LexSum is an abstractive summarization dataset for US Civil Rights Lawsuits

Language: Jupyter Notebook | 1701

PheValuator

An R package for evaluating phenotype algorithms.

Language: R | 17 9 39

Twin_Causal_Nets

Estimating the probabilities of causation via deep monotonic twin networks

Language: Python | MIT | 1700

weaksup-subset-selection

Subset selection / data pruning for weak supervision

Language: Python | MIT | 14 10

real-time-admissions

Code to accompany paper published in Nature Digital Medicine

Language: R | BSD-3-Clause | 800

parametric-robustness-evaluation

Code for paper "Evaluating Robustness to Dataset Shift via Parametric Robustness Sets"

Language: Python | MIT | 7 70

large-scale-temporal-shift-study

Code for Large-Scale Study of Temporal Shift in Health Insurance Claims. Christina X Ji, Ahmed M Alaa, David Sontag. CHIL, 2023. https://arxiv.org/abs/2305.05087

Language: Python | 4 7 6