Peter Morgan's starred repositories
DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Look-into-MoEs
A Closer Look into Mixture-of-Experts in Large Language Models
refusal_direction
Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".
DeepSeek-Coder-V2
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
HiddenMambaAttn
Official PyTorch Implementation of "The Hidden Attention of Mamba Models"
samplernn-pytorch
PyTorch implementation of SampleRNN: An Unconditional End-to-End Neural Audio Generation Model
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
scaling-with-vocab
📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
distributed-kge-poplar
An end-user training and evaluation system for standard knowledge graph embedding models, developed to optimise for the WikiKG90Mv2 dataset
ThoughtSource
A central, open resource for data and tools related to chain-of-thought reasoning in large language models. Developed @ Samwald research group: https://samwald.info/
amazon-sagemaker-generativeai
Repository for training and deploying generative AI models, including text-to-text and text-to-image generation and a prompt engineering playground, using SageMaker Studio.
guidance-for-a-multi-tenant-generative-ai-gateway-with-cost-and-usage-tracking-on-aws
This Guidance demonstrates how to build an internal Software-as-a-Service (SaaS) platform that provides access to foundation models, like those available through Amazon Bedrock, to different business units or teams within your organization
RydbergGPT
Our LLM for Rydberg atom physics
zeroshot-classifier
Notebooks for training universal zero-shot classifiers on many different tasks
summer-school-transformers-2023
Course repository for the session "Hands-on Transformers: Fine-Tune your own BERT and GPT" of the Data Science Summer School 2023
NeMo-Skills
A pipeline to improve the skills of large language models