Safe reinforcement learning integrating physical laws, control theory, and formal methods

PI: Ding Zhao, Assistant Professor, Mechanical Engineering, College of Engineering

Co-PIs: Conrad Tucker, Professor, Mechanical Engineering, College of Engineering; Eunsuk Kang, Assistant Professor, Institute for Software Research, CyLab

Innovations driven by recent progress in artificial intelligence (AI) have shown human-competitive performance in sensing, decision-making, and manipulation. However, as research expands to real-world cyber-physical applications, AI safety is becoming the crux of the transition from theory to practice. In this project, we aim to develop a safe deep reinforcement learning (RL) scheme that leverages physical laws, control theory, and formal methods to tackle simulation-to-real challenges in RL applications. The CBI grant provides an opportunity to synthesize the three PIs' existing work on safe AI and to promote novel cross-cutting approaches.

The team will develop model-based RL grounded in Lyapunov control theory for systems with time delays and non-stationary environments, in models generated from engineering principles, and in formal methods for provably safe AI. Specifically, we will explore a novel model-based method that incorporates these classical theories in an expressive way. This work will leverage physics-based simulation (e.g., the Unity physics simulation environment) to evaluate model-based RL, which enables the exploration of safety scenarios that may be too costly or dangerous to test frequently in real-world environments. The PIs will then investigate how well the trained model handles real-world conditions by deploying two physical delivery robots, developed in previous projects, on the CMU campus to deliver books for the library. By studying how well model-based simulation results transfer to the real world, the PIs will gain a fundamental understanding of which physical constraints and assumptions in the simulation are challenging to map to real-world conditions.