Mark Crowley | Three great papers accepted to AAAI, NeurIPS Workshop and EMBC in the past month!

Members of the UWECEML lab have had a good couple of months, with three notable papers accepted to great venues.

A New Approach to Scaling Decision Making to Many Agents

The most recently accepted paper (Ganapathi Subramanian et al., 2022) covers a core topic in the final stages of my PhD student Sriram Ganapathi Subramanian's research into Multi-Agent Reinforcement Learning (MARL).

When the number of agents is very large, we need to make some assumptions about structure to make RL feasible. Mean Field Theory is one such approach: each agent models its interactions with everyone else through a single “cloud” agent, which is actually an aggregation of all the other agents computed via a mean field calculation.

In this paper we relax a core requirement of existing Mean Field approaches, which is that all agents use the same policy. Our new Decentralized Mean Field Game concept allows agents to have separate policies while still using the mean field assumption about other agents to keep decision making feasible. We show some convergence results and provide a fixed point guarantee for a Q-learning based algorithm under this paradigm.
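To make the idea concrete, here is a minimal tabular sketch of a mean field style Q-learning update in which every agent keeps its own Q-table. It is not the algorithm from the paper: the names `mean_action`, `discretize`, `Agent`, the two-action setup, and the choice to bin the mean action so that it can index a table are all assumptions made for illustration.

```python
from collections import defaultdict

N_ACTIONS = 2  # hypothetical number of discrete actions


def mean_action(actions, exclude):
    """Average one-hot encoding of every agent's action except agent `exclude`.

    This plays the role of the single aggregated "cloud" agent.
    """
    others = [a for i, a in enumerate(actions) if i != exclude]
    counts = [0.0] * N_ACTIONS
    for a in others:
        counts[a] += 1.0
    return tuple(c / len(others) for c in counts)


def discretize(mean, bins=4):
    """Round the mean action onto a coarse grid so it can be a table key (assumption)."""
    return tuple(round(m * bins) for m in mean)


class Agent:
    """One agent with its own Q-table, i.e. no policy sharing between agents."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)  # keys: (state, own action, binned mean action)
        self.alpha = alpha
        self.gamma = gamma

    def update(self, state, action, mean, reward, next_state, next_mean):
        """Standard Q-learning update, conditioned on the mean action of the others."""
        key = (state, action, discretize(mean))
        best_next = max(
            self.q[(next_state, a, discretize(next_mean))] for a in range(N_ACTIONS)
        )
        target = reward + self.gamma * best_next
        self.q[key] += self.alpha * (target - self.q[key])
        return self.q[key]
```

In a system with N agents, each agent would call `mean_action(joint_actions, exclude=i)` with its own index and then update its own table, so the cost per agent does not grow with the size of the joint action space.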

Natural Language Analysis for Medical Reports

This paper (Allada et al., 2021), led by recently graduated student Aishwarya Krishna Allada, makes a comparative analysis of various natural language embedding models on the novel domain of digital pathology reports and attempts to determine the strengths and weaknesses of different ways of dealing with these large and variable language datasets.
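One of the representations compared in the paper is TF-IDF, both on its own and combined with embeddings from pre-trained models. The sketch below shows one plausible way to build such a combined feature vector in plain Python. The function names, the toy tokens, the unsmoothed IDF formula, and the choice of simple concatenation are all assumptions; the paper's abstract does not say how the combination is done.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document).

    `docs` is a list of non-empty token lists. Uses tf = count / doc length
    and idf = log(N / document frequency), without smoothing.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / doc_freq[tok]) for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[tok] / len(doc) * idf[tok] for tok in vocab])
    return vocab, vectors


def combine(tfidf_vec, dense_vec):
    """Concatenate TF-IDF features with a dense embedding (assumed combination rule)."""
    return list(tfidf_vec) + list(dense_vec)


# Toy usage: the dense vector stands in for the output of a pre-trained model.
reports = [["benign", "tissue"], ["malignant", "tissue"]]
vocab, vectors = tfidf_vectors(reports)
features = combine(vectors[0], [0.25, -0.5])
```

A word that appears in every report, such as "tissue" here, gets a weight of zero, which is the property that makes TF-IDF a useful complement to dense embeddings on highly repetitive clinical text.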

Empirical Study of Reinforcement Learning Algorithms in Multi-Agent Settings

This study (Lee et al., 2021) was led by fourth-year undergraduate student Ken Ming Lee, with lots of guidance and help from my PhD student Sriram Ganapathi Subramanian. It’s a great empirical comparison of a number of single-agent and multi-agent RL algorithms on a standard set of MARL problems. Often single-agent algorithms are quickly hacked to do decision making in multi-agent domains, and they seem to work fairly well. Our motivating question was how often this is true, and what kinds of problems require approaches that model multi-agent interaction more directly.
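For readers unfamiliar with the "single-agent algorithm in a multi-agent domain" setup, the following is a minimal sketch of independent learners: each agent runs an ordinary single-agent update and treats the other agent as part of the environment. The two-action coordination game, the class name `IndependentQLearner`, and the hyperparameters are hypothetical and are not taken from the study, which uses PettingZoo environments and deep RL algorithms.

```python
import random


class IndependentQLearner:
    """Stateless (bandit-style) Q-learner that ignores the other agent entirely."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.1, seed=0):
        self.q = [0.0] * n_actions
        self.alpha = alpha
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def act(self):
        """Epsilon-greedy choice over this agent's own action values."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        """Move the value of the chosen action toward the observed reward."""
        self.q[action] += self.alpha * (reward - self.q[action])


def coordination_reward(a0, a1):
    """Both agents are rewarded only when they pick the same action."""
    return 1.0 if a0 == a1 else 0.0


def train(steps=500):
    """Run two independent learners against each other in the matrix game."""
    agents = [IndependentQLearner(2, seed=0), IndependentQLearner(2, seed=1)]
    for _ in range(steps):
        a0, a1 = agents[0].act(), agents[1].act()
        reward = coordination_reward(a0, a1)
        agents[0].update(a0, reward)  # neither update sees the other's action
        agents[1].update(a1, reward)
    return agents
```

Because each learner only sees its own action and the shared reward, the environment looks non-stationary from its point of view, which is the reason such algorithms carry no theoretical guarantees in multi-agent settings.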

References:

Mean Field MARL

Decentralized Mean Field Games

Sriram Ganapathi Subramanian, Matthew Taylor, Mark Crowley, and Pascal Poupart.

In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-2022). Virtual. Feb, 2022.

Multiagent reinforcement learning algorithms have not been widely adopted in large scale environments with many agents as they often scale poorly with the number of agents. Using mean field theory to aggregate agents has been proposed as a solution to this problem. However, almost all previous methods in this area make a strong assumption of a centralized system where all the agents in the environment learn the same policy and are effectively indistinguishable from each other. In this paper, we relax this assumption about indistinguishable agents and propose a new mean field system known as Decentralized Mean Field Games, where each agent can be quite different from others. All agents learn independent policies in a decentralized fashion, based on their local observations. We define a theoretical solution concept for this system and provide a fixed point guarantee for a Q-learning based algorithm in this system. A practical consequence of our approach is that we can address a ‘chicken-and-egg’ problem in empirical mean field reinforcement learning algorithms. Further, we provide Q-learning and actor-critic algorithms that use the decentralized mean field learning approach and give stronger performances compared to common baselines in this area. In our setting, agents do not need to be clones of each other and learn in a fully decentralized fashion. Hence, for the first time, we show the application of mean field learning methods in fully competitive environments, large-scale continuous action space environments, and other environments with heterogeneous agents. Importantly, we also apply the mean field method in a ride-sharing problem using a real-world dataset. We propose a decentralized solution to this problem, which is more practical than existing centralized training methods.

NLP-DigiPath

Analysis of Language Embeddings for Classification of Unstructured Pathology Reports

Aishwarya Krishna Allada, Yuanxin Wang, Veni Jindal, Morteza Babaie, H.R. Tizhoosh, and Mark Crowley.

In International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, Nov, 2021.

A pathology report is one of the most significant medical documents providing interpretive insights into the visual appearance of the patient’s biopsy sample. In digital pathology, high-resolution images of tissue samples are stored along with pathology reports. Despite the valuable information that pathology reports hold, they are not used in any systematic manner to promote computational pathology. In this work, we focus on analyzing the reports, which are generally unstructured documents written in English with sophisticated and highly specialized medical terminology. We provide a comparative analysis of various embedding models like BioBERT, Clinical BioBERT, BioMed-RoBERTa and Term Frequency-Inverse Document Frequency (TF-IDF), a traditional NLP technique, as well as the combination of embeddings from pre-trained models with TF-IDF. Our results demonstrate the effectiveness of various word embedding techniques for pathology reports.

MARLEmpirical

Investigation of Independent Reinforcement Learning Algorithms in Multi-Agent Environments

Ken Ming Lee, Sriram Ganapathi Subramanian, and Mark Crowley.

In NeurIPS 2021 Deep Reinforcement Learning Workshop. Dec, 2021.

Independent reinforcement learning algorithms have no theoretical guarantees for finding the best policy in multi-agent settings. However, in practice, prior works have reported good performance with independent algorithms in some domains and bad performance in others. Moreover, a comprehensive study of the strengths and weaknesses of independent algorithms is lacking in the literature. In this paper, we carry out an empirical comparison of the performance of independent algorithms on four PettingZoo environments that span the three main categories of multi-agent environments, i.e., cooperative, competitive, and mixed. We show that in fully-observable environments, independent algorithms can perform on par with multi-agent algorithms in cooperative and competitive settings. For the mixed environments, we show that agents trained via independent algorithms learn to perform well individually, but fail to learn to cooperate with allies and compete with enemies. We also show that adding recurrence improves the learning of independent algorithms in cooperative partially observable environments.