Control, Learning and Adaptation in Information-Constrained, Adversarial Environments


Technical point of contact: Phil Root, DARPA
Period of activity: 2019-2022

Overview of the Project

The objective of this project is to develop theory, algorithms, and case studies for the control, learning, and adaptation of autonomous agents that patrol information-constrained, adversarial, safety-critical environments.

This project will develop a theoretical and algorithmic foundation for autonomous robotic agents capable of executing patrol missions in urban environments, possibly as mixed teams that pair autonomous robotic agents with heterogeneous sensing, perception, computation, and actuation capabilities with a smaller number of soldiers (possibly in a supervisory role). To this end, we will formalize a range of problems---some of which are considered for the first time---in the context of partial-information, stochastic games.

While partial-information, stochastic games provide a highly expressive modeling language, the synthesis of strategies in such games subject to temporal and logical constraints is, in its general form, known to be computationally intractable. Therefore, we plan to establish trade-offs between the expressivity of the problems and their algorithmic and computational tractability through a hierarchy of abstractions.
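To make the source of this intractability concrete, the sketch below (a hypothetical toy model, not the project's formulation) shows a single Bayesian belief update for a patroller with partial information about an adversary moving on a small graph. The reachable set of such beliefs grows rapidly with the horizon, which is what motivates the abstraction hierarchy discussed above; all model parameters here are illustrative assumptions.

```python
def belief_update(belief, transition, obs_likelihood, observation):
    """One step of Bayesian belief filtering over graph nodes.

    belief:         dict state -> probability (current belief)
    transition:     dict state -> dict state -> probability (adversary dynamics)
    obs_likelihood: function (observation, state) -> P(observation | state)
    """
    # Predict: push the belief through the adversary's stochastic dynamics.
    predicted = {}
    for s, p in belief.items():
        for s_next, q in transition[s].items():
            predicted[s_next] = predicted.get(s_next, 0.0) + p * q

    # Correct: reweight by the likelihood of the received observation.
    unnorm = {s: p * obs_likelihood(observation, s) for s, p in predicted.items()}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("observation inconsistent with current belief")
    return {s: p / total for s, p in unnorm.items()}


# Illustrative 3-node patrol graph: the adversary moves to a neighbor or
# stays; the sensor reports the true node with probability 0.8 and any
# other node with probability 0.1 (hypothetical numbers).
transition = {
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.25, 1: 0.5, 2: 0.25},
    2: {1: 0.5, 2: 0.5},
}

def noisy_sensor(obs, state):
    return 0.8 if obs == state else 0.1

belief = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
belief = belief_update(belief, transition, noisy_sensor, observation=1)
```

Starting from a uniform belief, one noisy observation of node 1 concentrates most of the probability mass there; a strategy synthesizer must plan over all beliefs reachable by such updates, not over the three graph nodes alone.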

  • Thrust I – Synthesis in partial-information, stochastic games: We adopt partial-information, stochastic, two-player games played over finite graphs as the base model. Additionally, we augment this setting with temporal logic specifications in order to capture constraints on the evolution of the plays in the game as well as on the knowledge of the patroller. Thrust I will develop approaches to mitigate the computational complexity of synthesis within this modeling class and to extract strategies that balance the induced risk, ambiguity, and randomization:
    • Task I.1 – Strategy synthesis via belief set abstractions
    • Task I.2 – Strategies with risk and ambiguity budgets
    • Task I.3 – Strategies with partial and restricted randomization
  • Thrust II – Proactive strategies in adversarial environments: While Thrust I takes a passive approach by focusing on the synthesis of strategies that account for the limitations in prior knowledge and run-time information, Thrust II aims to proactively cope with these limitations at run time through learning and active sensing:
    • Task II.1 – Safety-constrained learning in adversarial domains
    • Task II.2 – Proactive sensing
  • Thrust III – Safeguarding against the adversary’s adaptation and deception: Thrust III will help establish an understanding of the cascading levels of reasoning between the patroller and the adversary, and the methods developed under it will account for the effects of this reasoning on the patroller’s decisions. The proposed tasks are, however, also mindful of the insurmountable complexity of directly modeling such mutual adaptation in a stochastic, partial-information game setting. We will instead pursue two indirect and complementary approaches:
    • Task III.1 – Suppressing the adversary’s ability to infer
    • Task III.2 – Discovering the adversary’s deceptive tactics
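The belief-set idea behind Task I.1 can be illustrated with a toy subset construction (our own sketch under simplifying assumptions, not the project's algorithm): instead of tracking full probability distributions, the patroller tracks only the set of nodes the adversary could occupy, and splitting those sets by the observation received yields a finite abstraction of the otherwise continuous belief space.

```python
def reachable_belief_sets(moves, observations, obs_of, initial):
    """Enumerate belief sets reachable under a subset construction.

    moves:        dict state -> set of successor states (adversary's options)
    observations: iterable of observation symbols
    obs_of:       function state -> observation the patroller would receive
    initial:      initial belief set (iterable of states)
    """
    start = frozenset(initial)
    frontier = [start]
    seen = {start}
    while frontier:
        b = frontier.pop()
        # Successor belief: every state reachable in one adversary move...
        succ = set().union(*(moves[s] for s in b))
        # ...refined by the observation the patroller receives.
        for o in observations:
            refined = frozenset(s for s in succ if obs_of(s) == o)
            if refined and refined not in seen:
                seen.add(refined)
                frontier.append(refined)
    return seen


# Illustrative 4-node line graph; the patroller only observes the parity
# of the adversary's node (hypothetical observation model).
moves = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
beliefs = reachable_belief_sets(
    moves, observations=(0, 1), obs_of=lambda s: s % 2, initial={0, 1, 2, 3}
)
```

In this toy instance only three belief sets are reachable, so strategy synthesis can run over a small finite abstraction rather than the full distribution space; the price, as Thrust I notes, is the loss of quantitative (probabilistic) information.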