Reinforcement Learning

Spring 2024 - ECE 457C

ECE 457C - Reinforcement Learning

Offered Spring 2024 by Prof. Mark Crowley

Course Description

Introduction to Reinforcement Learning (RL) theory and algorithms for learning decision-making policies in situations with uncertainty and limited information. Topics include Markov decision processes, classic exact/approximate RL algorithms such as value/policy iteration, Q-learning, State-action-reward-state-action (SARSA), Temporal Difference (TD) methods, policy gradients, actor-critic, and Deep RL such as Deep Q-Learning (DQN), Asynchronous Advantage Actor Critic (A3C), and Deep Deterministic Policy Gradient (DDPG).

Course Outline

See the official course outline at https://outline.uwaterloo.ca/view/n9ffkw for course location, times, staff contact and other information.

This page will have additional resources linking to previous courses, topic notes, etc., which may also be duplicated on LEARN.



Course News and Announcements

These will generally be posted into LEARN so that everyone can get a notification of announcements and updates. Be sure to enable notifications for course announcements.


2024-05-07

Lecture 2 on May 7, 2024 cancelled - Assigned Reading

Nice meeting everyone yesterday; I’m sure it’s going to be a fun term. Some information for you all:

  • There is no lecture today, Tuesday May 7, 2024. Instead, you should look at the following materials and reading on LEARN:
    • Multi-Armed Bandits (Slides) - I started going through these slides; you can read through the rest. There are slides from last year with my annotations and additional resources on this topic to fill in the ideas.
    • If your probability concepts are a bit rusty, then look through the materials in the “Review of Probability Theory” section. We don’t need anything too advanced; if everything in ece457cprobreview makes sense to you, then you’re fine. If not, then take a look and start a discussion on Piazza as needed.
  • My original idea that we sometimes had Wednesday lectures was incorrect, so I’ll see you all again Monday.
  • I’ll create office hours for later this week.
  • The dates for Assignment 1 on the outline were incorrect; they are being updated.

2024-04-26

Welcome Everyone! Let’s Talk AI!

In this announcement:

  • Course Communication Information
  • Midterm Date
  • An Erratic Planned Lecture Schedule
    • First Lecture(s): Monday May 6, 2024 - 3 hours on first day

Course Communication Processes

Hi Everyone! Looking forward to another fun spring term talking about one of the most interesting areas of modern AI these days, Reinforcement Learning! We’ll be using Piazza and LEARN for discussion and content; you know the drill by now! News and announcements will be posted on LEARN and Piazza, as well as initially on my website for the course (https://markcrowley.ca/rlcourse/) for people who aren’t registered yet.

Boilerplate: We’ll be conducting all class-related discussion on Piazza this term. The quicker you begin asking questions on Piazza (rather than via emails), the quicker you’ll benefit from the collective knowledge of your classmates and instructors. We encourage you to ask questions when you’re struggling to understand a concept. You can even post anonymously, in which case only the TAs and Prof will know who it is.

Important: For direct contact with the Prof or TAs on anything personal or administrative, please use private Piazza messages so that all three of us get a chance to respond. Prof. Crowley is “bad at email” (…it’s true), so an email might get missed. If you really have a time-critical issue, or you don’t want the TAs to see the message (for any reason at all), then you should send Prof. Crowley a direct message on Teams chat. But please use this method only when critical.

Midterm

The course will have a Midterm Exam and a Final Exam this year. The Midterm is already scheduled:

  • Midterm Exam: June 17, 11am to 12:15pm, in room E7 4043
  • Full ECE Midterm Schedule: https://uwaterloo.ca/electrical-computer-engineering/midterm-schedule-0

An Erratic Planned Lecture Schedule

Due to having Monday lectures, and a large number of holidays, conferences, and other events, our “regular weekly schedule” is far from regular.

  • “Regular” week: Two 1.5 hour lectures a week on Mondays and Tuesdays.
  • But some weeks we have a total of 1.5 hours of lecture, and some weeks we have 6 hours.
  • Notably, the very first week we have:
    • Monday (in person): 3 hours all at once on May 6 on the first day of class
    • Tuesday (cancelled): 1.5 hours of lecture are scheduled for May 7 but I will need to cancel it, see note below.
    • Wednesday (virtual): 1.5 hours (I was looking at an old schedule; there are no Wednesday lectures)

This means we’ll be jumping right in with full content that first day. We will certainly have a break and some icebreaker discussions to let people chat, discuss group partners, and ask general course questions.

So come to class Monday and be ready to talk about th

Additional Lecture Complications

In addition to the planned schedule, from time to time I will need to cancel a lecture, or provide it virtually as a live online lecture on Teams. This is partly because of a family health situation that requires me to be in Toronto a fair bit this term. One or two weeks I’ll be attending a conference and will switch some lectures to virtual-live or virtual-pre-recorded.

If you have any concerns or input on how I can best do all of this in a way that supports your learning, please do contact me.