Reinforcement Learning was a graduate-level course on agents that must learn, plan, and act in complex, non-deterministic environments. The course covered the core theory and approaches of Reinforcement Learning (RL), along with common software libraries and packages used to implement and test RL algorithms, and included a programming component in the form of homework assignments and a final project. Topics covered included:
Monte Carlo Methods
Temporal Difference Methods
Planning and Learning
Neural Networks for Control
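As a small illustration of the temporal difference methods listed above, here is a sketch of tabular Q-learning on a toy corridor environment. The environment, reward structure, and hyperparameters are invented for illustration and are not from the course or project.

```python
import random

# Tabular Q-learning on a tiny 5-state corridor (illustrative only).
# States 0..4; reaching state 4 ends the episode with reward 1.
N_STATES = 5
ACTIONS = [1, -1]          # move right or left
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    s2 = min(max(s + a, 0), N_STATES - 1)
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

random.seed(0)
for _ in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = step(s, a)
        # Q-learning TD update: bootstrap on the best next-state value
        target = r + (0.0 if done else GAMMA * max(Q[(s2, x)] for x in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# The learned greedy policy moves right from every interior state
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
```

The key line is the TD update: the estimate Q(s, a) is nudged toward the bootstrapped target r + γ·max Q(s', ·) rather than toward a full Monte Carlo return.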
For the final project, which was required to incorporate topics from the class, my group created a stock-trading agent. We combined several extensions of the DQN algorithm with sentiment analysis, technical indicators, and autocorrelation data, and paired them with a trading policy we designed called LocalMax, yielding a method we called rTDQN.
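The DQN family that such a method builds on is centered on a temporal-difference target computed from a separate target network. Below is a sketch of the Double DQN target, one of the standard DQN extensions; the linear "networks," feature sizes, and sampled transition here are stand-ins for illustration and do not reproduce the project's actual architecture.

```python
import numpy as np

# Illustrative Double DQN target computation. The "networks" are stand-in
# linear functions; all shapes and values here are invented for the example.
rng = np.random.default_rng(0)
n_actions, n_features = 3, 4
W_online = rng.normal(size=(n_actions, n_features))   # online network weights
W_target = rng.normal(size=(n_actions, n_features))   # target network weights
gamma = 0.99

def q_values(W, state):
    # Q(s, .) for a linear function approximator
    return W @ state

# One sampled transition (state, reward, next_state, done)
s = rng.normal(size=n_features)
r, done = 0.5, False
s2 = rng.normal(size=n_features)

# Double DQN decoupling: the online net selects the next action,
# the target net evaluates it, which reduces overestimation bias
a_star = int(np.argmax(q_values(W_online, s2)))
td_target = r + (0.0 if done else gamma * q_values(W_target, s2)[a_star])
```

Vanilla DQN would instead take max over the target net's own Q-values; the decoupled selection/evaluation shown here is what distinguishes the Double DQN extension.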
The salient building blocks of rTDQN can be seen here:
An example heatmap of autocorrelation data that was fed to the agent at each time step can be seen here:
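To give a sense of the kind of statistic such a heatmap visualizes, here is a sketch of computing lagged autocorrelations over a window of log returns. The synthetic price series, window size, and lag range are illustrative assumptions; the project's exact feature construction is not reproduced here.

```python
import numpy as np

# Lagged autocorrelation of log returns over a synthetic price series
# (illustrative only; all parameters are invented for the example).
rng = np.random.default_rng(1)
prices = 100 * np.cumprod(1 + rng.normal(0, 0.01, size=256))
returns = np.diff(np.log(prices))

def autocorr(x, lag):
    # Sample autocorrelation at the given lag, normalized by total variance
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

lags = range(1, 11)
acf = [autocorr(returns, k) for k in lags]  # one row of a heatmap-style matrix
```

Stacking such rows over a sliding window of time steps produces a lag-by-time matrix, which is the natural input for a heatmap like the one shown.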
Finally, we can see that our model (bottom-right) outperformed each of the other models in isolation, and that our LocalMax policy outperformed the standard greedy policy typically used for trading agents.
Our full paper can be found below:
Homework reports with associated programming assignments were also part of this course. An example report can be found below.