ECE750T4 Reinforcement Learning - Reading List
Reading list for the Grad Topics on Reinforcement Learning (ECE 750 Topic 4) for Fall 2024
Introduction
This course will have two components, which shift focus over the term. The first component will be lectures and work problems about the fundamentals of Reinforcement Learning (RL).
The second component will be communal presentation and discussion of research papers on advanced topics in RL. The papers will be read by all, presented briefly by a student, and then discussed by everyone, led by the presenting student, for half of the class period.
This second component is structured around the common grad school practice of Reading Groups. See more information below, or see the pages for some previous reading groups here.
Reading Groups Tips
In a reading group everyone takes turns leading discussion of a paper each week. Leading discussion can be as simple as having your own annotated notes on Hypothes.is to share and start discussion as we go through it together. Or it could be more involved, including making slides to present your overview of the paper’s contributions, highlights and weak points.
The Course Reading List
Papers listed below are ones that we are planning to read throughout the term. An order [n] is sometimes listed, but this is just a rough guide. In general, if one paper is based on work in an older paper, the older paper should be discussed first, or in the same week.
What you need to do
If you are a student in this course you need to do the following:
- Create a Hypothes.is account and sign up for the course group on Hypothes.is so you and everyone in the class can see our shared annotations
- Pick the papers you will be reading, presenting, and leading discussion of and sign up (sign-up process TBD)
- PhD Students: choose two papers, one of them near the start to set a good example!
- Master’s Students: choose at least one paper
- Then read the paper in detail, and use Hypothes.is to make annotations for yourself and to guide others. Use the Hypothes.is course group you were all invited to in order to do this.
- Prepare to present the main points of the paper, and guide discussion through the parts that are surprising, challenging, interesting, or that you don’t understand.
- The class discussion of the paper should help everyone, including you and the prof, come away with a better understanding and evaluation of this publication.
Papers We’ll Be Reading
See the links below for information about and notes on papers: planned readings for some time in the early, middle, or later part of the course; the current reading for this week; or those that are done from previous weeks.
(Jump to a stage and sign up to lead a paper)
current ~ early ~ middle ~ later ~ done
current
early
- Shallow [1] State of the Art Control of Atari Games Using Shallow Reinforcement Learning. In Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems. International Foundation for Autonomous Agents and Multiagent Systems, Richland, South Carolina, USA. 2016. DISCUSSED ON: 2024-10-04 by Mark Crowley
- HER [3] Hindsight Experience Replay. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA. 2017.
- PER [3] Prioritized Experience Replay. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. 2016. DISCUSSED ON: 2024-10-11 by Mark Crowley?
- Revisit-ER [5] Revisiting fundamentals of experience replay. In International Conference on Machine Learning. 2020.
- A3C [6] Asynchronous Methods for Deep Reinforcement Learning. In Proceedings of The 33rd International Conference on Machine Learning (ICML). 2016. DISCUSSED ON: 2024-10-11 by Mark Crowley
- [6] Human-level control through deep reinforcement learning. Nature. 518, (7540). 2015.
- [7] Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International conference on machine learning. 2018.
middle
- Distributional reinforcement learning. MIT Press, 2023.
- Attention option-critic. arXiv preprint arXiv:2201.02628. 2022.
- The starcraft multi-agent challenge. arXiv preprint arXiv:1902.04043. 2019.
- [10] Reinforcement Learning as a Framework for Ethical Decision Making. In AAAI Workshop: AI, Ethics, and Society. 2016.
- [19] Deep Hedging with Market Impact. Canadian Artificial Intelligence Association (CAIAC), May, 2024.
later
- RLHF
- MORAL
- MoralityInterpret [6] Morality, Machines, and the Interpretation Problem: A Value-based, Wittgensteinian Approach to Building Moral Agents. In Artificial Intelligence XXXIX. Springer International Publishing, Cham. 2022.
- ChemGymRL [19] ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry. arXiv preprint arXiv:2305.14177. 2023.
done
- Shallow [1] State of the Art Control of Atari Games Using Shallow Reinforcement Learning. In Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems. International Foundation for Autonomous Agents and Multiagent Systems, Richland, South Carolina, USA. 2016. DISCUSSED ON: 2024-10-04 by Mark Crowley
Other Reading
The following sections list publications that no one needs to volunteer to present. They are mostly reference papers and books, simulation environment descriptions, or other resources that may also prove useful in understanding the course topic. Publications listed under potential probably won’t be discussed if no one volunteers to present them.
(Not to sign up for, just for reference and interest)
reference ~ foundational ~ environment ~ potential
reference
- Artificial intelligence: a modern approach. Pearson Education, Inc., 2010.
- [0] The Art of Reinforcement Learning: Fundamentals, Mathematics, and Implementations with Python. Apress Berkeley, CA, 2023.
foundational
- Natural Actor Critic. In European Conference on Machine Learning. Springer Verlag, Berlin, 2005.
- Policy Gradient Methods for Reinforcement Learning with Function Approximation. 12, MIT Press, 1999.
- Natural gradient works efficiently in learning. Neural Computation. 10, 1998.
- Neuro-Dynamic Programming. Athena Scientific, Nashua, NH. 1996.
- Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning. 8, (2). 1992.
- Neurocontrol and Supervised Learning: An Overview and Evaluation. 1992.
- Learning from Delayed Rewards. UK. 1989.
- Modified Policy Iteration Algorithms for Discounted Markov Decision Problems. Management Science. 24, (11). 1978.
- Dynamic Programming. Princeton University Press, New Jersey. 1957.
environment
- Minerl: A large-scale dataset of minecraft demonstrations. arXiv preprint arXiv:1907.13440. 2019.
- The starcraft multi-agent challenge. arXiv preprint arXiv:1902.04043. 2019.
- Openai gym. arXiv preprint arXiv:1606.01540. 2016.
- [19] Deep Hedging with Market Impact. Canadian Artificial Intelligence Association (CAIAC), May, 2024.
- ChemGymRL [19] ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry. arXiv preprint arXiv:2305.14177. 2023.
potential
- Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. In International Conference on Robotics and Automation (ICRA). 2022.
- UAV Coverage Path Planning under Varying Power Constraints Using Deep Reinforcement Learning. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020.
- Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. ACM, Virtual Event Ireland. Oct, 2020.
- Model-ensemble trust-region policy optimization. arXiv preprint arXiv:1802.10592. 2018.
- A distributional perspective on reinforcement learning. In International conference on machine learning. 2017.
- Constrained policy optimization. In International conference on machine learning. 2017.
- Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971. 2015.
- Knows What It Knows: A Framework For Self-Aware Learning. Proceedings of the 25th International Conference on Machine Learning. 2008.
- PAC Model-Free Reinforcement Learning. Update. 2006.