← Back

Beyond Hard Constraints: Budget-Conditioned Reachability For Safe Offline Reinforcement Learning

Janaka Chathuranga Brahmanage, Akshat Kumar
📄 Paper (Extended Version)</> Code
BCRL Method Overview

Overview

Real-world applications of Reinforcement Learning (RL) must balance reward maximization with safety constraints. While safety reachability analysis is a promising alternative to unstable min–max optimization, most reachability-based methods address only hard safety constraints rather than cumulative cost constraints.

We propose Budget-Conditioned Reachability RL (BCRL), a novel offline safe RL algorithm that learns a safe policy from a fixed dataset without environment interaction by using dynamic budgets to enforce safety constraints without adversarial optimization.

Main Contributions