Ted Moskovitz's repositories

TOP

Implementation of Tactical Optimistic and Pessimistic value estimation

Language: Python

WNPG

Implementation of Wasserstein Natural Policy Gradients and Wasserstein Natural Evolution Strategies

Language: Python

ConstrainedRL4LMs

A library for constrained RLHF.

Language: Python

directorv3

Mastering Diverse Domains through World Models

Language: Python, License: MIT

reinforcement_learning

My solutions to Denny Britz's short course on RL.

Language: Jupyter Notebook, License: MIT

first_occupancy

A First Occupancy Representation for Reinforcement Learning

Language: Jupyter Notebook

SVGD

Stein Variational Gradient Descent

Language: Jupyter Notebook

bayesian_modeling

A collection of simple Bayesian machine learning methods implemented on toy data.

Language: Python

Computational_Decipherment

Applying deep learning and other machine learning methods to the decipherment of ancient writing systems.

Language: Java

ConvRNN_Analysis

Analyze Biologically-Realistic Convolutional Recurrent Networks

Language: Jupyter Notebook

GA_TSP

A simple genetic algorithm (GA) for solving the travelling salesman problem.

Language: Jupyter Notebook

LambdaRepresentation

Lambda Representation for Diminishing Marginal Utility

Language: Python

SimpleCUDA

Simple Neural Network in CUDA

Language: Cuda

tvpo

An implementation of Total Variation Policy Optimization (TVPO)

Language: Python

DeepEncoding

Language: Jupyter Notebook

DeepLearning_Thesis

A sample of code from my thesis at Princeton applying deep learning models to neural spike data.

Language: Jupyter Notebook

Feedback_Alignment

Investigating biologically-plausible implementations of the backpropagation algorithm.

Language: Jupyter Notebook

minRLHF

A (somewhat) minimal library for finetuning language models with PPO on human feedback.

nanoGPTConstraints

License: MIT

PracticeCpp

Simple C++ Programs

Language: C++

RAM_CIFAR10

Language: Python

simple_database

Language: Python

SimplePPO

A Simple, Easily-Customizable, Fully Jitted PPO Implementation in Jax

Language: Jupyter Notebook

TDSR_python

Successor representation for RL

Language: Python

utils

Language: Python