robertalanm

Carro's repositories

safe-rlhf

Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Language: Python | Apache-2.0 | 2 10

alpaca-weight

Train LLaMA with LoRA on a single 4090 and merge the LoRA weights to work like Stanford Alpaca.

Language: Python | MIT | 100

CodingSubnet

MIT | 1 10

langchain

⚡ Building applications with LLMs through composability ⚡

Language: Python | MIT | 100

reward-modeling

Language: Python | 100

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | Apache-2.0 | 100

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python | MIT | 100

airoboros

Customizable implementation of the self-instruct paper.

Language: Python | Apache-2.0 | 000

alpaca-lora

Code for reproducing the Stanford Alpaca InstructLLaMA result on consumer hardware

Language: Jupyter Notebook | Apache-2.0 | 000

autocrit

A repository for transformer critique learning and generation

Language: Python | 000

axolotl

Go ahead and axolotl questions

Language: Python | Apache-2.0 | 000

ColossalAI

Making large AI models cheaper, faster and more accessible

Language: Python | Apache-2.0 | 000

direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)

Language: Python | Apache-2.0 | 000

discord

Language: Python | MIT | 010

gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

Language: Python | Apache-2.0 | 000

H3

Language Modeling with the H3 State Space Model

Apache-2.0 | 000

langflow

⛓️ LangFlow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows.

Language: TypeScript | MIT | 000

langfuse

open-source observability for LLM applications

Language: TypeScript | NOASSERTION | 000

langfuse-python

Language: Python | 000

llama-trl

LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA

Language: Jupyter Notebook | Apache-2.0 | 000

minimal-llama

Language: Python | 000

OpenLLaMA2

A Ray-based High-performance LLaMA2 RLHF framework

Language: Python | Apache-2.0 | 000

opentensorAI-connector-template

000

orca

Experiments into reproducing orca

010

pfrl

PFRL: a PyTorch-based deep reinforcement learning library

MIT | 000

raodottown

website for rao.town

Language: JavaScript | 000

substrate-indexer

indexer for substrate chain (bt)

Language: TypeScript | MIT | 000

t-jepa

Language: Python | 010

validators

Repository for bittensor validators

Language: Python | MIT | 000

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | Apache-2.0 | 000