
Reinforcement Learning

        Reinforcement Learning was a graduate-level course on agents that must learn, plan, and act in complex, non-deterministic environments. The course covered the core theory and approaches of Reinforcement Learning (RL), along with common software libraries and packages used to implement and test RL algorithms, and included a programming component in the form of homework assignments and a final project.

Topics included:

  • Multi-armed bandits

  • Q-Learning

  • Markov Decision Processes (MDPs)

  • Dynamic Programming

  • Monte Carlo Methods

  • Temporal Difference Methods

  • n-Step TD

  • Planning and Learning

  • Function Approximation

  • Neural Networks for Control
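To give a flavor of one of these topics, here is a minimal sketch of tabular Q-learning with an epsilon-greedy behavior policy. The two-state toy environment, hyperparameters, and helper names below are invented for this illustration and are not taken from the course materials:

```python
import random

# Hypothetical toy problem: a 2-state chain. Action 1 moves toward
# (and stays at) state 1, which pays reward +1; action 0 resets to
# state 0 with no reward. Purely illustrative.
N_STATES, N_ACTIONS = 2, 2
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(state, action):
    """Toy deterministic dynamics for the example chain."""
    if action == 1:
        if state == 0:
            return 1, 0.0   # move right, no reward yet
        return 1, 1.0       # stay at the goal, reward +1
    return 0, 0.0           # action 0 resets to the start

def epsilon_greedy(state):
    """Explore with probability EPSILON, otherwise act greedily on Q."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[state][a])

random.seed(0)
state = 0
for _ in range(5000):
    action = epsilon_greedy(state)
    next_state, reward = step(state, action)
    # Q-learning update: bootstrap off the greedy value of the next state.
    Q[state][action] += ALPHA * (
        reward + GAMMA * max(Q[next_state]) - Q[state][action]
    )
    state = next_state
```

After training, the learned values prefer action 1 in both states, since repeatedly taking action 1 earns the recurring +1 reward.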

Final Project

Reinforcement Learning required a final project which incorporated topics from the class. My group elected to create a stock-trading agent.


        For our final project, my group built a stock-trading agent that combined several extensions of the DQN algorithm with sentiment analysis, technical indicators, and autocorrelation data, together with an action-selection policy we designed called LocalMax. We called the resulting method rTDQN.


The salient building blocks of rTDQN can be seen here:


An example heatmap of autocorrelation data that was fed to the agent at each time step can be seen here:
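As a rough illustration of this kind of input, the sketch below computes the sample autocorrelation of a return series at several lags, the sort of grid a heatmap would visualize. It assumes a plain lag autocorrelation over a synthetic random series; the actual features and market data used by rTDQN are described in the paper:

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var

# Hypothetical stand-in for a window of daily returns; the real agent
# consumed market data, which is not reproduced here.
random.seed(1)
returns = [random.gauss(0.0, 1.0) for _ in range(250)]

# One row of the lag-vs-window grid a heatmap would visualize.
row = [round(autocorrelation(returns, lag), 3) for lag in range(1, 11)]
print(row)
```

For uncorrelated noise like this, every entry hovers near zero; structure in real market data shows up as bands of stronger positive or negative values in the heatmap.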


Finally, we can see that our model (bottom-right) outperformed each of the other models in isolation, and that our LocalMax policy outperformed the standard greedy policy commonly used for trading agents.


Our full paper can be found below:

Example Homework


Homework reports with associated programming assignments were also part of this course. An example report can be found below.
