Reinforcement Learning - Reading List
Reading list for Reinforcement Learning
The papers listed below are a loose superset of the ones I try to cover, or have students present, throughout the term during my RL courses.
And lo, the legends tell us that before there was even the DQN, there were the incredible VFAs, and before them the Great Age of the Value Tables themselves.
Foundational papers fill the first part of the list. Once we enter the era of Deep Reinforcement Learning, the papers are grouped into early, middle, or later for how they would fall in a graduate course on Advanced RL such as my RL Courses (ECE 457C/657C). Other categories relate to general reference, papers mostly about environments to test out RL algorithms, or potential papers for future reading.
Further ordering with [n] is sometimes listed, but this is just a rough guide. In general, if one paper builds on work in an older paper, the older paper should be discussed first, or in the same week.
(You can jump to any stage with these links to find a paper)
foundational ~ early ~ middle ~ later ~ reference
foundational
- [1] Dynamic Programming. Princeton University Press, New Jersey. 1957.
- [2] Modified Policy Iteration Algorithms for Discounted Markov Decision Problems. Management Science. 24, (11). 1978.
- [6] Natural Actor Critic. In European Conference on Machine Learning. Springer Verlag, Berlin, 2005.
- [6] Actor-Critic Algorithms. In Advances in Neural Information Processing Systems. MIT Press, 1999.
- [8] Learning from Delayed Rewards. UK. 1989.
- [9] Policy Gradient Methods for Reinforcement Learning with Function Approximation. 12, MIT Press, 1999.
- [9] Natural gradient works efficiently in learning. Neural Computation. 10, 1998.
- [9] Neuro-Dynamic Programming. Athena Scientific, Nashua, NH. 1996.
- [9] Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning. 7, (2). 1992.
- [10] Neurocontrol and Supervised Learning: An Overview and Evaluation. 1992.
early
- Shallow [1] State of the Art Control of Atari Games Using Shallow Reinforcement Learning. In Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems. International Foundation for Autonomous Agents and Multiagent Systems, Richland, South Carolina, USA. 2016. DISCUSSED ON: 2024-10-04 by Mark Crowley
- HER [3] Hindsight Experience Replay. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA. 2017.
- PER [3] Prioritized Experience Replay. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. 2016. DISCUSSED ON: 2024-10-11 by Mark Crowley?
- Revisit-ER [5] Revisiting fundamentals of experience replay. In International Conference on Machine Learning. 2020.
- A3C [6] Asynchronous Methods for Deep Reinforcement Learning. In Proceedings of The 33rd International Conference on Machine Learning (ICML). 2016. DISCUSSED ON: 2024-10-11 by Mark Crowley
- [6] Human-level control through deep reinforcement learning. Nature. 518, (7540). 2015.
- [7] Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International Conference on Machine Learning. 2018.
middle
- Distributional reinforcement learning. MIT Press, 2023.
- Attention option-critic. arXiv preprint arXiv:2201.02628. 2022.
- The StarCraft multi-agent challenge. arXiv preprint arXiv:1902.04043. 2019.
- [10] Reinforcement Learning as a Framework for Ethical Decision Making. In AAAI Workshop: AI, Ethics, and Society. 2016.
- [19] Deep Hedging with Market Impact. Canadian Artificial Intelligence Association (CAIAC), May, 2024.
later
- RLHF
- MORAL
- MoralityInterpret [6] Morality, Machines, and the Interpretation Problem: A Value-based, Wittgensteinian Approach to Building Moral Agents. In Artificial Intelligence XXXIX. Springer International Publishing, Cham. 2022.
- ChemGymRL [19] ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry. arXiv preprint arXiv:2305.14177. 2023.
Other Reading
The following sections list additional relevant publications that are optional for my course. They are mostly reference papers and books, simulation environment descriptions, or other resources that may also prove useful in understanding the course topic. Potential publications are even more optional, and there probably won't be time to discuss them in the course unless someone volunteers to present them.
reference ~ environment ~ potential
reference
- [5] The Art of Reinforcement Learning: Fundamentals, Mathematics, and Implementations with Python. Apress, Berkeley, CA, 2023.
environment
- MineRL: A large-scale dataset of Minecraft demonstrations. arXiv preprint arXiv:1907.13440. 2019.
- The StarCraft multi-agent challenge. arXiv preprint arXiv:1902.04043. 2019.
- OpenAI Gym. arXiv preprint arXiv:1606.01540. 2016.
- [19] Deep Hedging with Market Impact. Canadian Artificial Intelligence Association (CAIAC), May, 2024.
- ChemGymRL [19] ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry. arXiv preprint arXiv:2305.14177. 2023.
potential
- Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. In International Conference on Robotics and Automation (ICRA). 2022.
- UAV Coverage Path Planning under Varying Power Constraints Using Deep Reinforcement Learning. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020.
- Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. ACM, Virtual Event Ireland. Oct, 2020.
- Model-ensemble trust-region policy optimization. arXiv preprint arXiv:1802.10592. 2018.
- A distributional perspective on reinforcement learning. In International Conference on Machine Learning. 2017.
- Constrained policy optimization. In International Conference on Machine Learning. 2017.
- Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971. 2015.
- Knows What It Knows: A Framework For Self-Aware Learning. Proceedings of the 25th International Conference on Machine Learning. 2008.
- PAC Model-Free Reinforcement Learning. Update. 2006.