Reinforcement Learning

Reinforcement Learning (RL) is the study of how computers can learn decision-making policies from experience.

One of my core research areas is understanding the computational mechanisms that enable learning to perform complex tasks primarily from experience and feedback. This topic, called Reinforcement Learning, has a rich history tying together fields as diverse as neuroscience, behavioural and developmental psychology, economics, and computer science. I approach it as a computational researcher aiming to build Artificial Intelligence agents that learn the way humans do: not through any correspondence between their "brains" and "neural" structures, but through the algorithms both use to learn to act in a complex, mysterious world.
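The core idea, learning a policy of actions purely from reward feedback, can be sketched with tabular Q-learning on a toy chain world. Everything here (the 5-state chain environment, the +1 goal reward, and all hyperparameters) is an illustrative assumption for this page, not drawn from any paper listed below:

```python
import random

# Minimal tabular Q-learning sketch on a toy 5-state chain MDP.
# States are 0..4; reaching state 4 ends the episode with reward +1.

N_STATES = 5
ACTIONS = [0, 1]        # 0 = move left, 1 = move right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

def step(state, action):
    """Deterministic transition: right moves toward the goal, left away."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(500):                          # training episodes
    state, done = 0, False
    while not done:
        # Epsilon-greedy exploration; break exact ties randomly.
        if random.random() < EPS or Q[state][0] == Q[state][1]:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[state][a])
        nxt, reward, done = step(state, action)
        # Q-learning temporal-difference update.
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = nxt

# The learned greedy policy moves right from every non-terminal state.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
```

The agent is never told that "right" is correct; the preference emerges from the temporal-difference updates propagating the goal reward backward through the chain.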

Learning Resources from The Lab

External Resources

  • Reinforcement Learning: An Introduction (2nd edition) by Sutton and Barto - http://incompleteideas.net/book/the-book-2nd.html
  • Martha White has a great RL Fundamentals Course
  • Sergey Levine has a very detailed Deep RL Course

Our Papers on Reinforcement Learning

  1. Dynamic Observation Policies in Observation Cost-Sensitive Reinforcement Learning
    Colin Bellinger, Mark Crowley, and Isaac Tamblyn.
    In Workshop on Advancing Neural Network Training: Computational Efficiency, Scalability, and Resource Optimization (WANT@NeurIPS 2023). New Orleans, USA. 2023.
  2. ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry
    Chris Beeler, Sriram Ganapathi Subramanian, Colin Bellinger, Mark Crowley, and Isaac Tamblyn.
    In NeurIPS 2023 AI for Science Workshop. New Orleans, USA. Dec, 2023.
  3. Demonstrating ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry
    Chris Beeler, Sriram Ganapathi Subramanian, Kyle Sprague, Mark Crowley, Colin Bellinger, and Isaac Tamblyn.
    In NeurIPS 2023 AI for Accelerated Materials Discovery (AI4Mat) Workshop. New Orleans, USA. Dec, 2023.
  4. ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry
    Chris Beeler, Sriram Ganapathi Subramanian, Kyle Sprague, Nouha Chatti, Colin Bellinger, Mitchell Shahen, Nicholas Paquin, Mark Baula, Amanuel Dawit, Zihan Yang, Xinkai Li, Mark Crowley, and Isaac Tamblyn.
    In ICML 2023 Synergy of Scientific and Machine Learning Modeling (SynS&ML) Workshop. Jul, 2023.
  5. Multi-Agent Advisor Q-Learning
    In International Joint Conference on Artificial Intelligence (IJCAI) : Journal Track. Macao, China. Aug, 2023.
  6. Balancing Information with Observation Costs in Deep Reinforcement Learning
    Colin Bellinger, Andriy Drozdyuk, Mark Crowley, and Isaac Tamblyn.
    In Canadian Conference on Artificial Intelligence. Canadian Artificial Intelligence Association (CAIAC), Toronto, Ontario, Canada. May, 2022.
  7. Learning from Multiple Independent Advisors in Multi-agent Reinforcement Learning
    In Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS). International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS), London, United Kingdom. Sep, 2023.
  8. Scientific Discovery and the Cost of Measurement – Balancing Information and Cost in Reinforcement Learning
    Colin Bellinger, Andriy Drozdyuk, Mark Crowley, and Isaac Tamblyn.
    In 1st Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE). Feb, 2022.
  9. Decentralized Mean Field Games
    Sriram Ganapathi Subramanian, Matthew Taylor, Mark Crowley, and Pascal Poupart.
    In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-2022). Virtual. Feb, 2022.
  10. Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments
    Frontiers in Artificial Intelligence. Sep, 2022.
  11. Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments
    In NeurIPS 2021 Deep Reinforcement Learning Workshop. Dec, 2021.
  12. Multi-Agent Advisor Q-Learning
    Journal of Artificial Intelligence Research (JAIR). 74, May, 2022.
  13. A Complementary Approach to Improve WildFire Prediction Systems
    Sriram Ganapathi Subramanian, and Mark Crowley.
    In Neural Information Processing Systems (AI for social good workshop). NeurIPS. 2018.
  14. Partially Observable Mean Field Reinforcement Learning
    Sriram Ganapathi Subramanian, Matthew Taylor, Mark Crowley, and Pascal Poupart.
    In Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS). International Foundation for Autonomous Agents and Multiagent Systems, London, United Kingdom. May, 2021.
  15. Active Measure Reinforcement Learning for Observation Cost Minimization: A framework for minimizing measurement costs in reinforcement learning
    Colin Bellinger, Rory Coles, Mark Crowley, and Isaac Tamblyn.
    In Canadian Conference on Artificial Intelligence. Springer, 2021.
  16. Deep Multi Agent Reinforcement Learning for Autonomous Driving
    Sushrut Bhalla, Sriram Ganapathi Subramanian, and Mark Crowley.
    In Canadian Conference on Artificial Intelligence. May, 2020.
  17. Learning Multi-Agent Communication with Reinforcement Learning
    Sushrut Bhalla, Sriram Ganapathi Subramanian, and Mark Crowley.
    In Conference on Reinforcement Learning and Decision Making (RLDM-19). Montreal, Canada. 2019.
  18. Training Cooperative Agents for Multi-Agent Reinforcement Learning
    Sushrut Bhalla, Sriram Ganapathi Subramanian, and Mark Crowley.
    In Proc. of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019). Montreal, Canada. 2019.
  19. Learning Forest Wildfire Dynamics from Satellite Images Using Reinforcement Learning
    Sriram Ganapathi Subramanian, and Mark Crowley.
    In Conference on Reinforcement Learning and Decision Making. Ann Arbor, MI, USA. 2017.
  20. Policy Gradient Optimization Using Equilibrium Policies for Spatial Planning Domains
    Mark Crowley.
    In 13th INFORMS Computing Society Conference. Santa Fe, NM, United States. 2013.
  21. Equilibrium Policy Gradients for Spatiotemporal Planning
    Mark Crowley.
    PhD thesis. UBC Library, Vancouver, BC, Canada. 2011.