ECE 750 Topic 40 - Reinforcement Learning

Offered Fall 2024 by Prof. Mark Crowley

Course Description

This advanced topics graduate course will focus on the theories, methods and applications of Reinforcement Learning (RL). RL is an Artificial Intelligence/Machine Learning (AI/ML) approach for building systems that can learn how to make decisions through their own experiences in an environment. The domain is more difficult than supervised ML since it involves uncertainty and limited information about how the world, and its dynamics, actually function. It can also be seen the AI analogy for the Optimal Control problem, where there are no dynamics models available and the objective is not globally known.

Foundations First…

This course should follow after learning about foundational theory and methods from Artificial Intelligence and Machine Learning, including Deep Learning methods. The foundations of RL theory will be covered in detail as needed to support understanding the recent explosion of interest and progress on Deep Reinforcement Learning methods over the past decade.

Reading Papers on the Latest in Deep RL

After the foundations are covered, a series of papers from recent years will be presented by students and discussed in class to see the relationships between concepts and to gain practice viewing advanced AI/ML publications critically and analytically.

See the course reading list [in progress] for an idea of the papers we will cover and how to sign up for pressentations.

Expected Background

No background in Reinforcement Learning is required for this course, although we will move quite quickly through the classic concepts and background. A background in control or decision making theory would be beneficial but not required.

The course will use standard concepts from probability and statistics, which will be assumed. It is assumed students are already have a background in standard Machine Learning concepts and practices, especially core Deep Learning concepts for fully connected networks and CNNs. We will go into depth on the various approaches for utilizing Deep Learning for Reinforcement Learning, and sequential machine learning more generally, but we will not spend much time looking at the definitions of neural networks, deep learning, CNNs, machine learning training methodology, etc. Thus, if you do not have this background, you should consider taking this course after a course such as ECE 657 or equivalent.

All other concepts needed for the course will be introduced directly. Examples and some assignments will depend on programming ability in Python.

Course Outline

See the official course outline for course location, times, staff contact and other information .

Additional Resource Links

This page will have additional resources linking to previous courses, topic notes etc, which may also be duplicated on LEARN.

Textbook (online, free)
From my foundational undergraduate course on the same topic: ECE 457C YouTube Channel

Learning Objectives

In this course we will build up the fundamental knowledge about these components and how they combine together to make such systems possible.

Identify and Explain the component theoretical concepts of Reinforcement Learning systems.
Implement or instantiate any of the classic Reinforcement Learning algorithms on a variety of domains.
Evaluate the performance of a particular RL system on a given domain through proper experimental design, statistical analysis and visualization.
Be able to write mathematical notation electronically using LaTeX (also useful for Word/Google Docs/Markdown).
Practice and feedback on reviewing, summarizing and utilizing theory and results from academic papers.

Resources/References

The Foundations of RL will be taught from online sources and the seminal text by Sutton and Barto available freely on line.

[SuttonBarto2018] - Reinforcement Learning: An Introduction. Book, free pdf of draft available. http://incompleteideas.net/book/the-book-2nd.html Later advances in Deep RL and influence of RL concepts and methods in society will be covered through readings of academic papers, which students will present in class and discuss with everyone. This list will change and update over time.
Playing Atari with Deep Reinforcement Learning, Mnih et al, 2013.
Prioritized Experience Replay, Schaul et al, 2015.
Rainbow: Combining Improvements in Deep Reinforcement Learning, Hessel et al, 2017.
Asynchronous Methods for Deep Reinforcement Learning, Mnih et al, 2016.
Proximal Policy Optimization Algorithms, Schulman et al, 2017.
Continuous Control With Deep Reinforcement Learning, Lillicrap et al, 2015.
and more from “Key Papers in Deep RL” - https://spinningup.openai.com/en/latest/spinningup/keypapers.html
A Dual-Agent Scheduler for Distributed Deep Learning Jobs on Public Cloud via Reinforcement Learning. Xing, et. al, 2023. ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
Two Theories of Moral Cognition. [[JuliaHaas]], 2022. Chapter in The International Library of Ethics, Law and Technology book series (ELTE,volume 22).

Assessment

Assessment in the course breaks down as follows:

20% Programming assignment on Fundaments of RL
20% Programming assignment on Deep RL
25% Midterm Exam on Fundamentals of RL and Deep RL
35% Written report/paper/project on a topic covered in one or more paper from the reading list (can be done alone or in pairs, research based students are encouraged to work alone on this)
- 10% Presentation of a paper from the required reading list that will be used in their project. Leading discussion with class about the paper. If working in pairs, both partners must present and discuss for roughly equal amounts of time.
- 10% Related to at least one paper from the Reading List:
  - Coding: Implement and test the algorithm in use in that paper, or use some of it’s concepts to modify some other existing RL algorithm.
  - Theory: Make some theoretical analysis, proof, or claim using one, or more, of the required papers.
  - Could also challenge/examine the paper’s experiments, or combine of elements from multiple papers and carry out your own experiments.
- 15% Written report on what you did, summarizing the related paper(s), and explaining the work you did extending it, implementing it, combining it with another paper, or confirming it. The paper must be written in the style of a Machine Learning conference paper. A LaTeX template will be given.

Exam Reference Notes

Note that the midterm exam and final exam will be entirely on paper in person, so any reliance you have on GenAI tools will not be available to you for the bulk of the course grades. For the midterm exam you will be permitted to bring in a couple pages of handwritten reference notes (a.k.a. “cheat sheets). These must be submitted with your exam and will returned afterwards if requested.

For more information also see the UWaterloo Policy Page on Generative AI.

Course Communication Processes

LEARN : Announcements will be sent out on LEARN, so you should set up your email notifications to forward any course announcements so you find out about them. All course lecture content, assignment materials, and grades will be hosted on LEARN as well.
Microsoft Teams: We will also use Microsoft Teams for communication and discussion. Announcements about the course will be posted there in a main channel. There will be a setup for paper discussion and each student presentation and report should be posted there for others to read as well.
We will try to use Teams for answering of questions on course material, but if it becomes too unwieldy let me know, we could use a more traditional discussion platform such as Piazza. However, since we have no Teaching Assistants for this course, I might not be as responsive to questions on Piazza as Teams.