- Modelled a reinforcement learning agent in TensorFlow that actively trades stocks by buying, selling, and holding, unlike (un)supervised models that only make predictions
- Implemented Value Iteration and Howard's Policy Iteration to find the optimal policy for a Markov Decision Problem
- Experimented with Vanilla Gradient Descent and implemented Momentum for faster training