Arushi Jain's starred repositories
direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
CoinBettingPolitex
Coin Betting Politex (CBP) for CMDPs
lspi-python
Least Squares Policy Iteration (LSPI) in Python
graduate-fellowships
A list of resources to fund your MS and/or PhD, particularly in computer science.