"Intelligent" systems (e.g., RL agents, humans, strategic game-theoretic groups) dynamically process environmental information and adapt their behavior according to some protocol, often to achieve a defined goal. My research generally addresses the following "inverse" questions: given observations of an intelligent system, how can we recover the underlying generative mechanisms that drive it? And how can we use these recovered mechanisms to predict or manipulate the system?
Below are a few examples; see my publications for a full list.
Suppose an external agent ("forward learner") performs stochastic gradient descent (SGD) to optimize a cost function. By observing sequential noisy gradients from this process, how can we ("inverse learner") reconstruct the cost function in its entirety?
We develop a passive stochastic gradient Langevin dynamics (PSGLD) algorithm to accomplish this, and provide finite-sample bounds on the reconstruction error. Our analysis exploits tools from the theory of Markov diffusion operators.
Traditional inverse reinforcement learning assumes a Markov Decision Process (MDP) environment and demonstrations from a static optimal policy. Here we generalize: we assume only SGD in a generic space (we can specialize to MDP optimization by considering policy gradient algorithms), and observations from the dynamic transient regime of learning.
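To give a flavor of the passive setting, here is a toy sketch (not the PSGLD algorithm itself, and with purely illustrative cost, noise, and kernel parameters): a forward learner runs SGD on a hidden scalar cost, and the inverse learner kernel-smooths the passively observed noisy gradients and integrates them to reconstruct the cost up to an additive constant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden cost the forward learner optimizes; the inverse learner never sees it.
f = lambda x: (x - 2.0) ** 2
grad_f = lambda x: 2.0 * (x - 2.0)

# Forward learner: SGD runs from random initializations. We passively record
# (iterate, noisy gradient) pairs -- the inverse learner cannot choose where
# gradients are evaluated.
step, samples = 0.05, []
for _ in range(200):
    x = rng.uniform(-4.0, 4.0)
    for _ in range(60):
        g = grad_f(x) + rng.normal(scale=0.5)   # noisy gradient observation
        samples.append((x, g))
        x -= step * g

xs, gs = map(np.array, zip(*samples))

# Inverse learner: kernel-smooth the observed gradients, then integrate
# numerically (trapezoid rule) to recover the cost up to a constant.
def grad_hat(q, bw=0.3):
    w = np.exp(-0.5 * ((xs - q) / bw) ** 2)
    return np.sum(w * gs) / np.sum(w)

grid = np.linspace(-3.5, 3.5, 71)
ghat = np.array([grad_hat(q) for q in grid])
f_hat = np.concatenate(([0.0], np.cumsum(0.5 * (ghat[1:] + ghat[:-1]) * np.diff(grid))))

# Compare to the true cost, both anchored to zero at the left endpoint.
err = np.max(np.abs(f_hat - (f(grid) - f(grid[0]))))
print(f"max reconstruction error over the sampled region: {err:.3f}")
```

The key constraint the sketch preserves is passivity: the inverse learner only sees gradients at points the forward learner happens to visit, including its transient sweep toward the minimizer.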
Representative Publication:
Finite-Sample Bounds for Adaptive Inverse Reinforcement Learning using Passive Langevin Dynamics
62nd IEEE Conference on Decision and Control, 2023
[pdf]
The stable behavior of strategically interacting multi-agent systems is captured by equilibrium concepts such as the Nash equilibrium. However, non-cooperative strategic equilibria often degrade the performance (utility attained) of each agent in the system. A goal of mechanism design is to fashion the game structure (the mapping from agent actions to outcomes) so that non-cooperative interactions lead to outcomes that maximize the performance (utility) of all agents.
We provide a novel algorithmic approach for accomplishing mechanism design adaptively: the designer iteratively interacts with the system and observes the resulting Nash equilibria. Our framework achieves mechanism design even when the designer has no knowledge of the agents' utility functions. We exploit tools from microeconomic revealed preference theory.
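A minimal sketch of the adaptive loop, under assumptions of my own choosing (a symmetric two-player quadratic game with a negative externality, and a two-point finite-difference welfare-ascent update standing in for the paper's revealed-preference machinery): the designer tunes a per-unit subsidy/tax without knowing the agents' utility parameters, observing only the equilibrium induced by each mechanism choice.

```python
import numpy as np

# Hidden game parameters: each agent i plays a_i with utility
#   u_i(a) = (v + s)*a_i - 0.5*a_i^2 - beta*a_i*a_j,
# where s is the designer's subsidy (s < 0 means a tax).
v, beta = 1.0, 0.5          # unknown to the designer

def nash_action(s):
    # Symmetric Nash equilibrium: best response a_i = v + s - beta*a_j.
    return (v + s) / (1.0 + beta)

def welfare(a):
    # Total agent utility (excluding designer transfers) at a symmetric profile.
    return 2.0 * (v * a - 0.5 * a * a - beta * a * a)

# Adaptive mechanism design: ascend the welfare observed at induced equilibria
# via a two-point finite-difference gradient estimate in the mechanism parameter.
s, lr, delta = 0.0, 0.2, 1e-3
for _ in range(200):
    grad = (welfare(nash_action(s + delta)) - welfare(nash_action(s - delta))) / (2 * delta)
    s += lr * grad

a_opt = v / (1.0 + 2.0 * beta)   # social optimum, used only to check the result
print(f"designed tax s = {s:.3f}, induced action = {nash_action(s):.3f}, optimum = {a_opt:.3f}")
```

Because the externality makes agents over-exert at the unregulated Nash equilibrium, the loop converges to a corrective tax (s = -0.25 here) that aligns the induced equilibrium with the social optimum, using only equilibrium observations.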
Representative Publication:
Adaptive Mechanism Design using Multi-Agent Revealed Preferences
63rd IEEE Conference on Decision and Control, 2024
[pdf]
Suppose a human makes decisions which are influenced by an underlying state of nature. How can one detect a change in the underlying state of nature by observing only these human decisions? Such a scenario arises in, e.g., detecting financial market shocks from observed human investments, or detecting adversarial strategy changes through individual-level monitoring.
We exploit a novel model of human decision-making, which generalizes traditional behavioral economics models, to capture structural properties of an optimal change-point detector in this setting. We characterize several mathematical properties, such as its threshold-policy behavior and its dependence on model parameters, e.g., how detection performance depends on the "rationality" of the observed decisions.
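The flavor of the rationality dependence can be illustrated with a simplified stand-in for the paper's model (my assumptions: a logit/softmax choice model with rationality parameter `lam`, binary decisions, and a standard CUSUM detector rather than the quantum-decision-theoretic one): more rational decisions separate the pre- and post-change decision distributions, so the change is detected faster.

```python
import numpy as np

rng = np.random.default_rng(1)

def decision_prob(state, lam):
    # Boundedly rational (logit) human: probability of choosing action 1
    # given the state's utility gap; lam is the rationality parameter.
    return 1.0 / (1.0 + np.exp(-lam * state))

def avg_detection_delay(lam, change_time=200, threshold=8.0, runs=300):
    # The state of nature flips from -1 to +1 at change_time; we observe
    # only the human's binary decisions and run a CUSUM detector on them.
    p0, p1 = decision_prob(-1.0, lam), decision_prob(+1.0, lam)
    llr1 = np.log(p1 / p0)               # log-likelihood ratio of decision 1
    llr0 = np.log((1 - p1) / (1 - p0))   # ... and of decision 0
    delays = []
    for _ in range(runs):
        stat, t = 0.0, 0
        while True:
            t += 1
            p = p1 if t > change_time else p0
            d = rng.random() < p
            stat = max(0.0, stat + (llr1 if d else llr0))  # CUSUM recursion
            if stat > threshold:         # threshold policy: alarm on crossing
                break
        delays.append(max(0, t - change_time))
    return float(np.mean(delays))

d_low, d_high = avg_detection_delay(lam=0.5), avg_detection_delay(lam=2.0)
print(f"mean detection delay: lam=0.5 -> {d_low:.1f}, lam=2.0 -> {d_high:.1f}")
```

The detector itself is a threshold policy (alarm when the statistic crosses a fixed level), echoing the structural result in the paper, and the delay shrinks sharply as `lam` grows.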
Representative Publication:
Quickest Detection for Human-Sensor Systems using Quantum Decision Theory
IEEE Transactions on Signal Processing, 2024
[pdf]