\( \newcommand{\TODO}[1]{{\color{red}TODO: {#1}}} \renewcommand{\vec}[1]{\mathbf{#1}} \newcommand{\state}{\vec{x}} \def\statet{\state_t} \def\statetp{\state_{t-1}} \def\statehist{\state_{1:t-1}} \def\statetn{\state_{t+1}} \def\obs{\meas} \def\obst{\obs_t} \def\act{a} \def\actt{\act_t} \def\acttp{\act_{t-1}} \def\acttn{\act_{t+1}} \def\Obs{\mathcal{O}} \def\ObsEnc{\Phi_o} \def\ObsProb{P_o} \def\ObsFunc{C} \def\ObsFuncFull{\ObsFunc(\statet, \actt) \rightarrow \obst} \def\ObsFuncInv{\ObsFunc^{-1}} \def\ObsFuncInvFull{\ObsFuncInv(\obst, \statetp, \actt) \rightarrow \statet} \def\StateSp{\mathcal{X}} \def\Action{\mathcal{A}} \def\TransP{P_{T}} \def\Trans{T} \def\TransFull{\Trans(\statet, \actt) \rightarrow \statetn} \def\TransObs{T_c} \def\Rew{R} \def\rew{r} \def\rewards{\vec{r}_{1:t}} \def\rewt{\rew_t} \def\rewtp{\rew_{t-1}} \def\rewtn{\rew_{t+1}} \def\RewFull{\Rew(\statet, \actt) \rightarrow \rewtn} \def\TransObsFull{\TransObs(\statet, \obst, \actt, \rewt; \theta_T) \rightarrow \statetn} \def\Value{V} \def\pit{\pi_t} \def\piDef{\pi(\acttn|\statet, \obst, \actt, \rewt; \theta_\pi) \rightarrow \pit(\acttn ; \theta_\pi)} \def\Valuet{\Value_t} \def\ValueDef{\Value(\statet, \obst, \actt, \rewt; \theta_\Value) \rightarrow \Valuet(\theta_\Value)} \def\R{\mathbb{R}} \def\E{\mathbb{E}} \newcommand{\Goal}{\mathcal{G}} \newcommand{\goalRV}{G} \newcommand{\meas}{z} \newcommand{\measurements}{\vec{\meas}_{1:t}} \newcommand{\meast}[1][t]{\meas_{#1}} \newcommand{\param}{\theta} \newcommand{\policy}{\pi} \newcommand{\graph}{G} \newcommand{\vtces}{V} \newcommand{\edges}{E} \newcommand{\st}{\state} \newcommand{\stn}{\st_{t+1}} \newcommand{\stt}{\st_t} \newcommand{\stk}{\st_k} \newcommand{\stj}{\st_j} \newcommand{\sti}{\st_i} \newcommand{\St}{\mathcal{S}} \newcommand{\Act}{\mathcal{A}} \newcommand{\acti}{\act_i} \newcommand{\lpt}{\delta} \newcommand{\trans}{P_T} \newcommand{\Q}{\qValue} \newcommand{\fwcost}{Q} \newcommand{\fw}{\fwcost} \newcommand{\qValue}{Q} 
\newcommand{\prew}{\Upsilon} \newcommand{\epiT}{T} \newcommand{\vma}{\alpha_\Value} \newcommand{\qma}{\alpha_\qValue} \newcommand{\prewma}{\alpha_\prew} \newcommand{\fwma}{\alpha_\fwcost} \newcommand{\maxValueBeam}{\vec{\state}_{\Value:\text{max}(m)}} \newcommand{\nil}{\emptyset} \newcommand{\discount}{\gamma} \newcommand{\minedgecost}{\fwcost_0} \newcommand{\goal}{g} \newcommand{\pos}{x} %\newcommand{\fwargs}[5]{\fw_{#4}^{#5}\left({#3}\middle|{#1}, {#2}\right)} \newcommand{\fwargs}[5]{\fw_{#4}^{#5}\left({#1}, {#2}, {#3}\right)} \newcommand{\Rgoal}{R_{\text{goal}}} \newcommand{\Loo}{Latency-1:\textgreater1} \newcommand{\Loss}{\mathcal{L}} \newcommand{\LossText}[1]{\Loss_{\text{#1}}} \newcommand{\LossDDPG}{\LossText{ddpg}} \newcommand{\LossStep}{\LossText{step}} \newcommand{\LossLo}{\LossText{lo}} \newcommand{\LossUp}{\LossText{up}} \newcommand{\LossTrieq}{\LossText{trieq}} \newcommand{\tgt}{\text{tgt}} \newcommand{\Qstar}{\Q_{*}} \newcommand{\Qtgt}{\Q_{\text{tgt}}} \newcommand{\ytgt}{y_t} % Symbols \newcommand{\ctrl}{\vec{u}} \newcommand{\Ctrl}{\mathcal{U}} \newcommand{\Data}{\mathcal{D}} \newcommand{\stdt}{\dot{\state}} \newcommand{\StDt}{\dot{\StateSp}} \newcommand{\dynSt}{f} \newcommand{\dynCt}{g} \newcommand{\bDynSt}{\bar{\dynSt}} \newcommand{\bDynCt}{\bar{\dynCt}} \newcommand{\dynAff}{F} \newcommand{\bDynAff}{\bar{\dynAff}} \newcommand{\ctrlaff}{\underline{\mathbf{\ctrl}}} \newcommand{\smallbmat}[1]{\left[\begin{smallmatrix}#1\end{smallmatrix}\right]} \newcommand{\Knl}{K} \newcommand{\knl}{\kappa} \newcommand{\bKx}{k_\state} \newcommand{\bKF}{k_\dynAff} \newcommand{\bKFu}{k_{\dynAff\ctrl}} \newcommand{\bKFx}{k_{\dynAff\state}} \newcommand{\bKFux}{k_{\dynAff\ctrl\state}} \newcommand{\covf}{\text{cov}} \newcommand{\dt}{\delta t} \newcommand{\dSt}{\stdt} \newcommand{\N}{\mathcal{N}} \newcommand{\StDat}{\mathbf{X}} \newcommand{\StDtDat}{\dot{\mathbf{X}}} \newcommand{\CtDat}{\underline{\boldsymbol{\mathcal{U}}}_{1:k}} \newcommand{\mat}[1]{{#1}} 
\newcommand{\Y}{\mat{Y}} \newcommand{\bY}{\bar{\Y}} \newcommand{\W}{\mat{W}} \newcommand{\V}{\mat{V}} \newcommand{\mH}{\mat{H}} \newcommand{\KH}{\Knl^\mH} \newcommand{\kH}{\knl^\mH} \newcommand{\GP}{\mathcal{GP}} \newcommand{\kDA}{\knl^\dynAff} \newcommand{\KDA}{\Knl^\dynAff} %\newcommand{\M}{\mathcal{M}} \newcommand{\kh}{\knl^{\dynAff\ctrlaff}} \newcommand{\KDat}{\mathfrak{K}} \newcommand{\kDat}{\bm{\knl}} \newcommand{\KhDat}{\KDat^{\dynAff\ctrlaff}} \newcommand{\khDADat}{\kDat^{\dynAff\ctrlaff\dynAff}} \newcommand{\khDA}{\knl^{\dynAff\ctrlaff\dynAff}} \newcommand{\dynAffDat}{\mathbf{\dynAff}} \newcommand{\grad}{\nabla} \newcommand{\Lie}{\mathcal{L}} \newcommand{\tdf}{\tilde{f}} \newcommand{\tdg}{\tilde{g}} \newcommand{\barf}{\bar{f}} \newcommand{\barg}{\bar{g}} \newcommand{\erf}{\textit{erf}} \newcommand{\etal}{et~al.} \newcommand{\CBC}{\mbox{CBC}} \newcommand{\CBCtwo}{\CBC^{(2)}} \newcommand{\Prob}{\mathbb{P}} \newcommand{\tdbff}{\bff^*_k} \newcommand{\mDynAffs}{\bfM_k} \newcommand{\bfBs}{\bfB_k} \DeclareMathOperator{\vect}{\textit{vec}} \DeclareMathOperator{\diag}{\mathbf{diag}} \DeclareMathOperator{\cov}{cov} \DeclareMathOperator{\Cov}{\mathbf{Cov}} \DeclareMathOperator{\Var}{Var} % Calligraphic fonts \newcommand{\calA}{{\cal A}} \newcommand{\calB}{{\cal B}} \newcommand{\calC}{{\cal C}} \newcommand{\calD}{{\cal D}} \newcommand{\calE}{{\cal E}} \newcommand{\calF}{{\cal F}} \newcommand{\calG}{{\cal G}} \newcommand{\calH}{{\cal H}} \newcommand{\calI}{{\cal I}} \newcommand{\calJ}{{\cal J}} \newcommand{\calK}{{\cal K}} \newcommand{\calL}{{\cal L}} \newcommand{\calM}{{\cal M}} \newcommand{\calN}{{\cal N}} \newcommand{\calO}{{\cal O}} \newcommand{\calP}{{\cal P}} \newcommand{\calQ}{{\cal Q}} \newcommand{\calR}{{\cal R}} \newcommand{\calS}{{\cal S}} \newcommand{\calT}{{\cal T}} \newcommand{\calU}{{\cal U}} \newcommand{\calV}{{\cal V}} \newcommand{\calW}{{\cal W}} \newcommand{\calX}{{\cal X}} \newcommand{\calY}{{\cal Y}} \newcommand{\calZ}{{\cal Z}} % Sets: 
\newcommand{\setA}{\textsf{A}} \newcommand{\setB}{\textsf{B}} \newcommand{\setC}{\textsf{C}} \newcommand{\setD}{\textsf{D}} \newcommand{\setE}{\textsf{E}} \newcommand{\setF}{\textsf{F}} \newcommand{\setG}{\textsf{G}} \newcommand{\setH}{\textsf{H}} \newcommand{\setI}{\textsf{I}} \newcommand{\setJ}{\textsf{J}} \newcommand{\setK}{\textsf{K}} \newcommand{\setL}{\textsf{L}} \newcommand{\setM}{\textsf{M}} \newcommand{\setN}{\textsf{N}} \newcommand{\setO}{\textsf{O}} \newcommand{\setP}{\textsf{P}} \newcommand{\setQ}{\textsf{Q}} \newcommand{\setR}{\textsf{R}} \newcommand{\setS}{\textsf{S}} \newcommand{\setT}{\textsf{T}} \newcommand{\setU}{\textsf{U}} \newcommand{\setV}{\textsf{V}} \newcommand{\setW}{\textsf{W}} \newcommand{\setX}{\textsf{X}} \newcommand{\setY}{\textsf{Y}} \newcommand{\setZ}{\textsf{Z}} % Vectors \newcommand{\bfa}{\mathbf{a}} \newcommand{\bfb}{\mathbf{b}} \newcommand{\bfc}{\mathbf{c}} \newcommand{\bfd}{\mathbf{d}} \newcommand{\bfe}{\mathbf{e}} \newcommand{\bff}{\mathbf{f}} \newcommand{\bfg}{\mathbf{g}} \newcommand{\bfh}{\mathbf{h}} \newcommand{\bfi}{\mathbf{i}} \newcommand{\bfj}{\mathbf{j}} \newcommand{\bfk}{\mathbf{k}} \newcommand{\bfl}{\mathbf{l}} \newcommand{\bfm}{\mathbf{m}} \newcommand{\bfn}{\mathbf{n}} \newcommand{\bfo}{\mathbf{o}} \newcommand{\bfp}{\mathbf{p}} \newcommand{\bfq}{\mathbf{q}} \newcommand{\bfr}{\mathbf{r}} \newcommand{\bfs}{\mathbf{s}} \newcommand{\bft}{\mathbf{t}} \newcommand{\bfu}{\mathbf{u}} \newcommand{\bfv}{\mathbf{v}} \newcommand{\bfw}{\mathbf{w}} \newcommand{\bfx}{\mathbf{x}} \newcommand{\bfy}{\mathbf{y}} \newcommand{\bfz}{\mathbf{z}} \newcommand{\bfalpha}{\boldsymbol{\alpha}} \newcommand{\bfbeta}{\boldsymbol{\beta}} \newcommand{\bfgamma}{\boldsymbol{\gamma}} \newcommand{\bfdelta}{\boldsymbol{\delta}} \newcommand{\bfepsilon}{\boldsymbol{\epsilon}} \newcommand{\bfzeta}{\boldsymbol{\zeta}} \newcommand{\bfeta}{\boldsymbol{\eta}} \newcommand{\bftheta}{\boldsymbol{\theta}} \newcommand{\bfiota}{\boldsymbol{\iota}} 
\newcommand{\bfkappa}{\boldsymbol{\kappa}} \newcommand{\bflambda}{\boldsymbol{\lambda}} \newcommand{\bfmu}{\boldsymbol{\mu}} \newcommand{\bfnu}{\boldsymbol{\nu}} \newcommand{\bfomicron}{\boldsymbol{\omicron}} \newcommand{\bfpi}{\boldsymbol{\pi}} \newcommand{\bfrho}{\boldsymbol{\rho}} \newcommand{\bfsigma}{\boldsymbol{\sigma}} \newcommand{\bftau}{\boldsymbol{\tau}} \newcommand{\bfupsilon}{\boldsymbol{\upsilon}} \newcommand{\bfphi}{\boldsymbol{\phi}} \newcommand{\bfchi}{\boldsymbol{\chi}} \newcommand{\bfpsi}{\boldsymbol{\psi}} \newcommand{\bfomega}{\boldsymbol{\omega}} \newcommand{\bfxi}{\boldsymbol{\xi}} \newcommand{\bfell}{\boldsymbol{\ell}} % Matrices \newcommand{\bfA}{\mathbf{A}} \newcommand{\bfB}{\mathbf{B}} \newcommand{\bfC}{\mathbf{C}} \newcommand{\bfD}{\mathbf{D}} \newcommand{\bfE}{\mathbf{E}} \newcommand{\bfF}{\mathbf{F}} \newcommand{\bfG}{\mathbf{G}} \newcommand{\bfH}{\mathbf{H}} \newcommand{\bfI}{\mathbf{I}} \newcommand{\bfJ}{\mathbf{J}} \newcommand{\bfK}{\mathbf{K}} \newcommand{\bfL}{\mathbf{L}} \newcommand{\bfM}{\mathbf{M}} \newcommand{\bfN}{\mathbf{N}} \newcommand{\bfO}{\mathbf{O}} \newcommand{\bfP}{\mathbf{P}} \newcommand{\bfQ}{\mathbf{Q}} \newcommand{\bfR}{\mathbf{R}} \newcommand{\bfS}{\mathbf{S}} \newcommand{\bfT}{\mathbf{T}} \newcommand{\bfU}{\mathbf{U}} \newcommand{\bfV}{\mathbf{V}} \newcommand{\bfW}{\mathbf{W}} \newcommand{\bfX}{\mathbf{X}} \newcommand{\bfY}{\mathbf{Y}} \newcommand{\bfZ}{\mathbf{Z}} \newcommand{\bfGamma}{\boldsymbol{\Gamma}} \newcommand{\bfDelta}{\boldsymbol{\Delta}} \newcommand{\bfTheta}{\boldsymbol{\Theta}} \newcommand{\bfLambda}{\boldsymbol{\Lambda}} \newcommand{\bfPi}{\boldsymbol{\Pi}} \newcommand{\bfSigma}{\boldsymbol{\Sigma}} \newcommand{\bfUpsilon}{\boldsymbol{\Upsilon}} \newcommand{\bfPhi}{\boldsymbol{\Phi}} \newcommand{\bfPsi}{\boldsymbol{\Psi}} \newcommand{\bfOmega}{\boldsymbol{\Omega}} % Blackboard Bold: \newcommand{\bbA}{\mathbb{A}} \newcommand{\bbB}{\mathbb{B}} \newcommand{\bbC}{\mathbb{C}} 
\newcommand{\bbD}{\mathbb{D}} \newcommand{\bbE}{\mathbb{E}} \newcommand{\bbF}{\mathbb{F}} \newcommand{\bbG}{\mathbb{G}} \newcommand{\bbH}{\mathbb{H}} \newcommand{\bbI}{\mathbb{I}} \newcommand{\bbJ}{\mathbb{J}} \newcommand{\bbK}{\mathbb{K}} \newcommand{\bbL}{\mathbb{L}} \newcommand{\bbM}{\mathbb{M}} \newcommand{\bbN}{\mathbb{N}} \newcommand{\bbO}{\mathbb{O}} \newcommand{\bbP}{\mathbb{P}} \newcommand{\bbQ}{\mathbb{Q}} \newcommand{\bbR}{\mathbb{R}} \newcommand{\bbS}{\mathbb{S}} \newcommand{\bbT}{\mathbb{T}} \newcommand{\bbU}{\mathbb{U}} \newcommand{\bbV}{\mathbb{V}} \newcommand{\bbW}{\mathbb{W}} \newcommand{\bbX}{\mathbb{X}} \newcommand{\bbY}{\mathbb{Y}} \newcommand{\bbZ}{\mathbb{Z}} \newcommand{\CBCr}{\mbox{CBC}^{(r)}} \) \( \newenvironment{proof}{\paragraph{Proof:}}{\hfill$\square$} %\newtheorem{theorem}{Theorem} %\theoremstyle{remark} %\newtheorem{lemma}{Lemma} %\newtheorem{remark}{Remark} %\theoremstyle{definition} \newtheorem{defn}{Definition} %\theoremstyle{definition} \newtheorem{exmp}{Example} \newtheorem{conj}{Conjecture} %\newtheorem{corollary}{Corollary} \newtheorem{Proposition}{Proposition} \newtheorem{ansatz}{Assumption} \newtheorem{problem}{Problem} \newcommand{\oprocendsymbol}{\hbox{$\bullet$}} \newcommand{\oprocend}{\relax\ifmmode\else\unskip\hfill\fi\oprocendsymbol} \def\eqoprocend{\tag*{$\bullet$}} \newcommand{\blue}[1]{\color{blue}{#1}} %% math functions \newcommand{\modulo}{\text{mod}} %% symbols \newcommand{\real}{\mathbb{R}} \newcommand{\integers}{\mathbb{N}} \newcommand{\complex}{\mathbb{C}} \DeclareMathOperator*{\argmax}{arg\,max} \DeclareMathOperator*{\argmin}{arg\,min} \DeclareMathOperator*{\softmax}{softmax} \DeclareMathOperator*{\Tr}{Tr} \DeclareMathOperator*{\RE}{Re} \DeclareMathOperator*{\IM}{Im} \DeclareMathOperator{\tr}{\mathbf{tr}} \newcommand{\floor}[1]{\lfloor #1 \rfloor} \newcommand{\ceil}[1]{\lceil #1 \rceil} \newcommand{\scaleMathLine}[2][1]{\resizebox{#1\linewidth}{!}{$\displaystyle{#2}$}} \)

Probabilistic Safety Constraints

for Learned High Relative Degree System Dynamics

Mohammad Javad Khojasteh\(^*\)
Vikas Dhiman\(^*\)
Massimo Franceschetti
Nikolay Atanasov

\(^*\) These authors contributed equally.

Problem formulation

  • \begin{align} \label{eq:system_dyanmics} \dot{\bfx} = f(\bfx) + g(\bfx)\bfu = \begin{bmatrix} f(\bfx) & g(\bfx)\end{bmatrix} \begin{bmatrix}1\\\bfu\end{bmatrix} =: F(\bfx) \ctrlaff \end{align}
  • \[ \vect(F(\bfx)) \sim \GP(\vect(\bfM_0(.)), \bfK_0(.,.)) \]
  • \begin{align} \min_{\bfu_k \in \mathcal{U}}& \text{ Task cost function } \\ \qquad\text{s.t.}&~~\bbP\bigl( \text{ Safety constraint } \mid \bfx_k,\bfu_k \bigr) \ge \tilde{p}_k, \end{align}
    \begin{align} \min_{\bfu_k \in \mathcal{U}}& \|\bfu_k - \cssId{highlight-border-red-1}{\class{fragment}{\pi_\epsilon(\bfx_k)}} \|_Q \\ \qquad\text{s.t.}&~~\bbP\bigl( \cssId{highlight-border-red-1}{\class{fragment}{h(\bfx) > \zeta_h > 0}} \mid \bfx_k,\bfu_k \bigr) \ge \tilde{p}_k, \end{align}
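The stacked control-affine form in the first bullet can be checked numerically. The sketch below is illustrative only: the two-state system, the helper name `stacked_dynamics`, and all numeric values are assumptions, not taken from the talk.

```python
import numpy as np

def stacked_dynamics(f, g, x, u):
    """Build F(x) = [f(x), g(x)] and the affine control [1; u], then return F(x) @ [1; u]."""
    fx = np.asarray(f(x), dtype=float).reshape(-1, 1)        # n x 1 drift term
    gx = np.asarray(g(x), dtype=float).reshape(len(fx), -1)  # n x m input matrix
    F = np.hstack([fx, gx])                                  # n x (1 + m)
    u_aff = np.concatenate([[1.0], np.atleast_1d(u)])        # [1; u]
    return F, u_aff, F @ u_aff

# Hypothetical 2-state, 1-input system, used only for illustration.
f = lambda x: np.array([x[1], -x[0]])
g = lambda x: np.array([[0.0], [1.0]])
x, u = np.array([0.5, -0.2]), np.array([0.3])

F, u_aff, xdot = stacked_dynamics(f, g, x, u)
```

Writing the dynamics as a single matrix times `[1; u]` is what lets one Gaussian process prior over `F` cover both the drift and the input matrix.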

Approach

  • Estimate \(F(\bfx)\) with uncertainty.
  • Propagate uncertainty to the Safety condition.
  • Extension to continuous time using Lipschitz continuity assumptions.
  • Extension to higher relative degree systems.

Matrix Variate Gaussian Processes

\[ \vect(F(\bfx)) \sim \GP(\vect(\bfM_0(.)), \bfK_0(.,.)) \]
Option 1: Alvarez et al. (FTML 2012): \[ \bfK_0(\bfx, \bfx') = \kappa(\bfx, \bfx') \boldsymbol{\Sigma} \]
Option 2: Sun et al. (AISTATS 2017)
\[ F \sim \mathcal{MVG}(\bfM, \bfA, \bfB) \Leftrightarrow \vect(F) \sim \calN(\vect(M), \bfB \otimes \bfA) \]
\[ \bfK_0(\bfx, \bfx') = \bfB_0(\bfx, \bfx') \otimes \bfA \]

Factorization assumption: \[ \vect(F(\bfx)) \sim \GP(\vect(\bfM_0(.)), \bfB_0(.,.) \otimes \bfA) \]
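The equivalence between the matrix-variate Gaussian and its vectorized form can be verified on a small example. This is a sketch under assumed values: the matrices `A`, `B`, the sizes, and the Cholesky-based sampler are illustrative choices, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 2, 3                                   # F is an n x p random matrix
A = np.array([[2.0, 0.3], [0.3, 1.0]])        # n x n row covariance (assumed)
B = np.diag([1.0, 0.5, 0.25]) + 0.1           # p x p column covariance (assumed)
M = np.zeros((n, p))                          # mean matrix

L_A, L_B = np.linalg.cholesky(A), np.linalg.cholesky(B)

def vec(X):
    """Stack the columns of X into one vector."""
    return X.reshape(-1, order="F")

# One draw F ~ MVG(M, A, B), built from i.i.d. standard normal entries Z.
Z = rng.standard_normal((n, p))
F_sample = M + L_A @ Z @ L_B.T

# vec(L_A Z L_B^T) = (L_B kron L_A) vec(Z), so Cov(vec(F)) = B kron A.
T = np.kron(L_B, L_A)
cov_vec = T @ T.T
```

The Kronecker structure means only an `n x n` and a `p x p` matrix need to be stored, rather than a dense `np x np` covariance.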

Matrix variate Gaussian Process

\( \newcommand{\prl}[1]{\left(#1\right)} \newcommand{\brl}[1]{\left[#1\right]} \newcommand{\crl}[1]{\left\{#1\right\}} \) \begin{equation} \begin{aligned} \vect(F(\bfx)) &\sim \mathcal{GP}(\vect(\bfM_0(\bfx)), \bfB_0(\bfx,\bfx') \otimes \bfA) %F(\bfx)\underline{\bfu} &\sim \mathcal{GP}(\bfM_0(\bfx)\underline{\bfu}, \underline{\bfu}^\top \bfB_0(\bfx,\bfx') \underline{\bfu}' \otimes \bfA) \end{aligned} \end{equation}
Given data \(\StDat_{1:k} := [\bfx(t_1), \dots, \bfx(t_k)]\), \(\StDtDat_{1:k}=[\dot{\bfx}(t_1), \dots, \dot{\bfx}(t_k)] \), and \( \underline{\boldsymbol{\mathcal{U}}}_{1:k}:= \diag(\ctrlaff_1, \dots, \ctrlaff_k) \).
\begin{equation*} \begin{aligned} \bfM_k(\bfx_*) &:= \bfM_0(\bfx_*) + \prl{ \dot{\bfX}_{1:k} - \boldsymbol{\mathcal{M}}_{1:k}\underline{\boldsymbol{\mathcal{U}}}_{1:k}} \prl{\underline{\boldsymbol{\mathcal{U}}}_{1:k}^\top\bfB_0(\bfX_{1:k},\bfX_{1:k})\underline{\boldsymbol{\mathcal{U}}}_{1:k}}^{-1}\underline{\boldsymbol{\mathcal{U}}}_{1:k}^\top\bfB_0(\bfX_{1:k},\bfx_*)\\ \bfB_k(\bfx_*,\bfx_*') &:= \bfB_0(\bfx_*,\bfx_*') - \bfB_0(\bfx_*,\bfX_{1:k})\underline{\boldsymbol{\mathcal{U}}}_{1:k}\prl{\underline{\boldsymbol{\mathcal{U}}}_{1:k}^\top\bfB_0(\bfX_{1:k},\bfX_{1:k})\underline{\boldsymbol{\mathcal{U}}}_{1:k}}^{-1}\underline{\boldsymbol{\mathcal{U}}}_{1:k}^\top\bfB_0(\bfX_{1:k},\bfx_*') \label{eq:mvg-posterior} \end{aligned} \end{equation*}
Inference on MVGP: \begin{align} \vect(F_k(\bfx_*)) &\sim \mathcal{GP}(\vect(\bfM_k(\bfx_*)), \; \bfB_k(\bfx_*,\bfx_*') \otimes \bfA). \\ F_k(\bfx_*)\underline{\bfu}_* &\sim \mathcal{GP}(\bfM_k(\bfx_*)\underline{\bfu}_*, \; \underline{\bfu}_*^\top\bfB_k(\bfx_*,\bfx_*')\underline{\bfu}_*\otimes\bfA). \end{align}
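A minimal sketch of the posterior computation follows. It assumes a zero prior mean, a separable prior `B_0(x, x') = rbf(x, x') * B` with a fixed matrix `B`, noise-free derivative data, and a small jitter for numerical stability; the function names and the toy data are hypothetical. The covariance update uses standard Gaussian conditioning, which subtracts the covariance explained by the data from the prior.

```python
import numpy as np

def rbf(X1, X2, ell=0.5):
    """Squared-exponential kernel between the rows of X1 and X2."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def mvgp_posterior(X, Xdot, U, x_star, B, jitter=1e-10):
    """Posterior M_k(x*) and B_k(x*, x*), assuming M_0 = 0 and B_0(x, x') = rbf(x, x') * B.
    X: (k, n) states, Xdot: (k, n) state derivatives, U: (k, m) controls."""
    k = X.shape[0]
    U_aff = np.hstack([np.ones((k, 1)), U])           # rows are [1, u_i]
    G = rbf(X, X) * (U_aff @ B @ U_aff.T)             # U^T B_0(X, X) U, size k x k
    G = G + jitter * np.eye(k)
    k_star = rbf(X, x_star[None, :])[:, 0]            # kernel values kappa(x_i, x*)
    C = (k_star[:, None] * U_aff) @ B                 # U^T B_0(X, x*), size k x (1 + m)
    M_k = Xdot.T @ np.linalg.solve(G, C)              # posterior mean, n x (1 + m)
    B_k = B - C.T @ np.linalg.solve(G, C)             # posterior column covariance
    return M_k, B_k

# Toy data from a pendulum-like system (values are illustrative assumptions).
X = np.array([[-1.0, 0.2], [-0.5, -0.4], [0.0, 0.6], [0.5, -0.1], [1.0, 0.3]])
U = np.array([[0.5], [-1.0], [0.2], [1.5], [-0.3]])
Xdot = np.stack([X[:, 1], -np.sin(X[:, 0]) + U[:, 0]], axis=1)
B = np.eye(2)

M_k, B_k = mvgp_posterior(X, Xdot, U, X[2], B)
u_aff = np.array([1.0, U[2, 0]])
```

Queried at a training pair `(x_i, u_i)`, the predicted derivative `M_k @ u_aff` reproduces the observed one and the predictive variance `u_aff @ B_k @ u_aff` collapses to nearly zero, which is a quick sanity check on the update.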

Approach

  • Estimate \(F(\bfx)\) with Matrix-Variate Gaussian Process
  • Propagate uncertainty to the Safety condition
  • Extension to continuous time using Lipschitz continuity assumptions.
  • Extension to higher relative degree systems.

Control Barrier Functions

  • For differentiable \( h(\bfx) \),
    safe set is \( \calC = \{ \bfx \in \calX : h(\bfx) > 0 \} \)
  • Assume \( \grad_\bfx h(\bfx) \ne 0 \quad \forall \bfx \in \partial \calC \)
  • Assume system starts in safe state \( \bfx(0) \in \calC \)
  • Ames et al. (ECC 2019): \begin{multline} \text{ System stays safe } \Leftrightarrow~~\exists~\bfu = \pi(\bfx)~~\text{s.t.}\\ \mbox{CBC}(\bfx,\bfu) := \Lie_f h(\bfx) + \Lie_g h(\bfx)\bfu + \alpha(h(\bfx)) \ge 0 \;~ \forall \bfx \in \calX. \end{multline} where \( \alpha(y) \) is an extended class \( \calK_\infty \) function.

Uncertainty propagation to CBC

  • \begin{align} \mbox{CBC}(\bfx, \bfu) &:= \Lie_{f}h(\bfx) + \Lie_{g}h(\bfx)\bfu + \alpha(h(\bfx)) \end{align}
  • \[ \mbox{CBC}(\bfx, \bfu)= \grad_\bfx h(\bfx)F_k(\bfx)\ctrlaff + \alpha(h(\bfx)) \]
  • Recall: \begin{equation} F_k(\bfx_*)\underline{\bfu}_* \sim \mathcal{GP}(\bfM_k(\bfx_*)\underline{\bfu}_*, \underline{\bfu}_*^\top\bfB_k(\bfx_*,\bfx_*')\underline{\bfu}_*\otimes\bfA). \end{equation}
  • Lemma: \[ \mbox{CBC}(\bfx, \bfu) \sim \GP(\E[\mbox{CBC}], \Var(\mbox{CBC})) \] \begin{align} \label{eq:parametofpi5543} \E[\mbox{CBC}_k](\bfx, \bfu) &= \nabla_\bfx h(\bfx)^\top \bfM_k(\bfx)\underline{\bfu} + \alpha(h(\bfx)),\\ \Var[\mbox{CBC}_k](\bfx, \bfx'; \bfu) &= \underline{\bfu}^\top\bfB_k(\bfx,\bfx')\underline{\bfu} \nabla_\bfx h(\bfx)^{\top}\bfA\nabla_\bfx h(\bfx') \end{align} Note: the mean is affine and the variance is quadratic in \( \bfu \).
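The mean and variance in the lemma translate directly into a few lines of code. In this sketch the function name `cbc_mean_var`, the choice `alpha(y) = y`, and all numeric inputs are assumptions made for illustration; the variance is evaluated at a single state, that is, with both kernel arguments equal.

```python
import numpy as np

def cbc_mean_var(grad_h, h_val, M_k, B_k, A, u, alpha=lambda y: y):
    """Mean and variance of CBC at one state, given the MVGP posterior (M_k, B_k, A)."""
    u_aff = np.concatenate([[1.0], np.atleast_1d(u)])      # affine control [1; u]
    mean = grad_h @ M_k @ u_aff + alpha(h_val)             # affine in u
    var = (u_aff @ B_k @ u_aff) * (grad_h @ A @ grad_h)    # quadratic in u
    return mean, var

# Illustrative numbers only.
grad_h = np.array([1.0, 0.0])
M_k = np.array([[0.2, 1.0], [0.0, 0.0]])
B_k = 0.1 * np.eye(2)
A = np.eye(2)

mean, var = cbc_mean_var(grad_h, 0.5, M_k, B_k, A, u=[2.0])
```

Here the mean is `0.2 + 1.0 * 2.0 + 0.5` and the variance is `0.1 * (1 + 2.0**2)`, which makes the affine and quadratic dependence on the control visible.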

Deterministic condition for controller

  • \begin{align} \min_{\bfu_k \in \mathcal{U}}& \text{ Task cost function } \\ \qquad\text{s.t.}&~~\bbP\bigl( \text{ Safety constraint } \mid \bfx_k,\bfu_k \bigr) \ge \tilde{p}_k, \end{align}
    \begin{align} \min_{\bfu_k \in \mathcal{U}}& \|\bfu_k - \pi_\epsilon(\bfx_k) \|_Q \\ \qquad\text{s.t.}&~~\bbP\bigl( \style{color:red}{\mbox{CBC}(\bfx_k, \bfu_k) > \zeta > 0} \mid \bfx_k,\bfu_k \bigr) \ge \tilde{p}_k, \end{align}
  • \[ \newcommand{\CBC}{\mbox{CBC}} \bbP\bigl(\mbox{CBC}(\bfx_k, \bfu_k) > \zeta \mid \bfx_k,\bfu_k \bigr) \ge \tilde{p}_k \\ \Leftrightarrow \frac{1}{2}-\frac{1}{2} \erf\left( \frac{\zeta - \E[\CBC] }{\sqrt{2\Var(\CBC)}} \right) \ge \tilde{p}_k \] where \( \erf(y) \) is the error function.
  • Safe controller (a QCQP): \begin{align} \min_{\bfu_k \in \mathcal{U}}& \|\bfu_k - \pi_\epsilon(\bfx_k) \|_Q \\ \qquad\text{s.t.}\qquad& \cssId{highlight-current-red-1}{\class{fragment}{ (\E[\CBC] - \zeta)^2 \ge 2\Var(\CBC)\erf^{-1}(1-2\tilde{p}_k)^2 }} \\ & \cssId{highlight-current-red-1}{\class{fragment}{ \E[\CBC] - \zeta \ge 0 }} \end{align}
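The chance constraint and its deterministic counterpart can be compared numerically. This sketch uses only the Python standard library; the helper names are hypothetical, `erfinv` is obtained from the normal quantile function, and the squared form is assumed to be used with \( \tilde{p}_k \ge 1/2 \), where it matches the probability statement.

```python
import math
from statistics import NormalDist

def prob_cbc_above(mean, var, zeta):
    """P(CBC > zeta) for a Gaussian CBC with the given mean and variance."""
    return 0.5 - 0.5 * math.erf((zeta - mean) / math.sqrt(2.0 * var))

def erfinv(y):
    """Inverse error function via erf^{-1}(y) = Phi^{-1}((y + 1) / 2) / sqrt(2)."""
    return NormalDist().inv_cdf((y + 1.0) / 2.0) / math.sqrt(2.0)

def satisfies_safe_constraint(mean, var, zeta, p):
    """Deterministic pair of constraints from the QCQP, intended for p >= 1/2."""
    margin = mean - zeta
    return margin >= 0 and margin**2 >= 2.0 * var * erfinv(1.0 - 2.0 * p) ** 2

prob = prob_cbc_above(mean=1.0, var=1.0, zeta=0.0)
ok_080 = satisfies_safe_constraint(1.0, 1.0, 0.0, p=0.80)
ok_090 = satisfies_safe_constraint(1.0, 1.0, 0.0, p=0.90)
```

With unit mean and variance the probability is about 0.84, so the check passes for a required level of 0.80 and fails for 0.90, in agreement with the probabilistic form.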

Approach

  • Estimate \(F(\bfx)\) with Matrix-Variate Gaussian Process
  • Propagate uncertainty to the Control Barrier condition.
  • Extension to continuous time using Lipschitz continuity assumptions.
  • Extension to higher relative degree systems.

Safety beyond triggering times

  • So far: \begin{align} \min_{\bfu_k \in \mathcal{U}}& \|\bfu_k - \pi_\epsilon(\bfx_k) \|_Q \\ \qquad\text{s.t.}&~~ \bbP\bigl( \mbox{CBC}(\style{color:red}{\bfx_k}, \bfu_k) > \style{color:red}{\zeta} \mid \bfx_k,\bfu_k \bigr) \ge \style{color:red}{\tilde{p}_k}, \end{align}
  • Next: \begin{align} \min_{\bfu_k \in \mathcal{U}}& \|\bfu_k - \pi_\epsilon(\bfx_k) \|_Q \\ \qquad\text{s.t.}&~~ \bbP\bigl( \mbox{CBC}(\style{color:red}{\bfx(t)}, \bfu_k) > \style{color:red}{0} \mid \bfx_k,\bfu_k \bigr) \ge \style{color:red}{p_k}, \qquad \style{color:red}{\forall t \in [t_k, \tau_k)} \end{align}

Safety beyond triggering times

  • Assume Lipschitz continuity of dynamics: \begin{align} \textstyle \label{eq:smoth23} \bbP\left( \sup_{s \in [0, \tau_k)}\|F(\bfx(t_k+s))\ctrlaff_k -F(\bfx(t_k))\ctrlaff_k\| \le L_k \|\bfx(t_k+s)-\bfx_k\| \right) \ge q_k:=1-e^{-b_kL_k}. \end{align}
  • Assume Lipschitz continuity of \( \alpha(h(\bfx)) \): \begin{align} \label{htym6!7uytf} |\alpha \circ h(\bfx(t_k+s))-\alpha \circ h(\bfx_k)| \le L_{\alpha \circ h} \|\bfx(t_k+s)-\bfx_k\|. \end{align}
Theorem: \[ \bbP\bigl( \mbox{CBC}(\bfx_k, \bfu_k) > \zeta \mid \bfx_k,\bfu_k \bigr) \ge \tilde{p}_k \quad\Rightarrow\quad \bbP\bigl( \mbox{CBC}(\bfx(t), \bfu_k) > 0 \mid \bfx_k,\bfu_k \bigr) \ge p_k, \; \forall t \in [t_k, \tau_k) \] holds with \( p_k = \tilde{p}_k q_k \) and \( \tau_k \le \frac{1}{L_k}\ln\left(1+\frac{L_k\zeta}{(\chi_kL_k+L_{\alpha \circ h})\|\dot{\bfx}_k\|}\right) \)
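The two quantities in the theorem are simple closed-form expressions. The sketch below evaluates them for made-up constants; the function names are hypothetical, and \( \chi_k \) is treated as a given input because the slides do not define it.

```python
import math

def triggering_bound(L_k, zeta, chi_k, L_alpha_h, xdot_norm):
    """Upper bound on the inter-triggering time tau_k from the theorem."""
    ratio = L_k * zeta / ((chi_k * L_k + L_alpha_h) * xdot_norm)
    return math.log(1.0 + ratio) / L_k

def continuous_time_prob(p_tilde, b_k, L_k):
    """Safety probability p_k = p_tilde * q_k with q_k = 1 - exp(-b_k * L_k)."""
    q_k = 1.0 - math.exp(-b_k * L_k)
    return p_tilde * q_k

# Illustrative constants only.
tau_max = triggering_bound(L_k=1.0, zeta=1.0, chi_k=1.0, L_alpha_h=1.0, xdot_norm=0.5)
p_k = continuous_time_prob(p_tilde=0.99, b_k=5.0, L_k=1.0)
```

A larger margin \( \zeta \) or a slower state (smaller \( \|\dot{\bfx}_k\| \)) lengthens the admissible interval, while \( p_k \) is always below \( \tilde{p}_k \) because \( q_k < 1 \).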

Approach

  • Estimate \(F(\bfx)\) with Matrix-Variate Gaussian Process
  • Propagate uncertainty to the Control Barrier condition.
  • Extension to continuous time using Lipschitz continuity assumptions.
  • Extension to higher relative degree systems.

Higher relative degree CBFs

  • \begin{align} \begin{bmatrix} \dot{\theta} \\ \dot{\omega} \end{bmatrix} = \underbrace{\begin{bmatrix} \omega \\ -\frac{g}{l} \sin(\theta) \end{bmatrix}}_{f(\bfx)} + \underbrace{\begin{bmatrix} 0 \\ \frac{1}{ml} \end{bmatrix}}_{g(\bfx)} u \end{align}
  • \begin{align} h\left(\begin{bmatrix} \theta \\ \omega \end{bmatrix} \right) = \cos(\Delta_{col}) - \cos(\theta - \theta_c) \end{align}
  • Note that \( \Lie_g h(\bfx) = \grad h(\bfx) g(\bfx) = 0 \)
  • Thus \( \CBC(\bfx, \bfu) \) is independent of \( \bfu \).
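The vanishing first Lie derivative can be confirmed with the pendulum model above. The parameter values (`m`, `l`, `grav`, `theta_c`, `delta_col`) and the test state are assumptions for illustration; the gradients are written out by hand from the definition of \( h \).

```python
import numpy as np

# Hypothetical pendulum parameters and obstacle geometry.
m, l, grav = 1.0, 1.0, 9.81
theta_c, delta_col = 1.0, 0.3

def f(x):
    return np.array([x[1], -grav / l * np.sin(x[0])])

def g(x):
    return np.array([0.0, 1.0 / (m * l)])

def grad_h(x):
    # h = cos(delta_col) - cos(theta - theta_c) depends on theta only.
    return np.array([np.sin(x[0] - theta_c), 0.0])

def grad_lie_f_h(x):
    # L_f h = sin(theta - theta_c) * omega, differentiated w.r.t. (theta, omega).
    return np.array([np.cos(x[0] - theta_c) * x[1], np.sin(x[0] - theta_c)])

x = np.array([0.2, 0.5])
Lg_h = grad_h(x) @ g(x)            # first-order term: control does not appear
LgLf_h = grad_lie_f_h(x) @ g(x)    # second-order term: control appears
```

Since `Lg_h` is zero while `LgLf_h` equals `sin(theta - theta_c) / (m * l)`, the control first enters through the second derivative of \( h \), so this barrier function has relative degree two away from \( \theta = \theta_c \).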

Exponential Control Barrier Functions (ECBF)

  • \[ \CBCr(\bfx, \bfu) := \Lie_f^{(r)} h(\bfx) + \Lie_g \Lie_f^{(r-1)} h(\bfx) \bfu + K_\alpha \begin{bmatrix} h(\bfx) \\ \Lie_f h(\bfx) \\ \vdots \\ \Lie_f^{(r-1)} h(\bfx) \end{bmatrix} \]
  • If \( r \ge 1 \) is the relative degree of the CBF \( h(\bfx) \), then \( \Lie_g \Lie_f^{k} h(\bfx) = 0, \; \forall k \in \{0, \dots, r-2 \} \) and \( \Lie_g \Lie_f^{(r-1)} h(\bfx) \ne 0 \).
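For the pendulum, the definition with \( r = 2 \) can be written out explicitly. This is a sketch with known dynamics and assumed parameter values and gain vector `K_alpha`; it is meant to show the structure of \( \CBCtwo \), not the learned, probabilistic version.

```python
import numpy as np

def cbc2_pendulum(theta, omega, u, K_alpha, m=1.0, l=1.0, grav=9.81,
                  theta_c=1.0, delta_col=0.3):
    """CBC^(2) for h = cos(delta_col) - cos(theta - theta_c) with known dynamics."""
    s, c = np.sin(theta - theta_c), np.cos(theta - theta_c)
    h = np.cos(delta_col) - c
    Lf_h = s * omega                                         # L_f h
    Lf2_h = c * omega**2 - s * grav / l * np.sin(theta)      # L_f^(2) h
    LgLf_h = s / (m * l)                                     # L_g L_f h
    return Lf2_h + LgLf_h * u + K_alpha @ np.array([h, Lf_h])

K_alpha = np.array([1.0, 2.0])    # assumed gains
value_at_center = cbc2_pendulum(1.0, 2.0, 0.0, K_alpha)
slope_in_u = cbc2_pendulum(0.2, 0.5, 2.0, K_alpha) - cbc2_pendulum(0.2, 0.5, 1.0, K_alpha)
```

The difference `slope_in_u` equals \( \Lie_g \Lie_f h \), confirming that the expression is affine in the control with that coefficient.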

Propagating uncertainty to \( \CBCtwo \)

  • \( \Lie_f h(\bfx) \) is a Gaussian process
  • \( \grad_\bfx \Lie_f h(\bfx) \) is a Gaussian process
  • \( [\grad_\bfx \Lie_f h(\bfx)]^\top F(\bfx)\ctrlaff \) is a quadratic form of GP
  • \[ \CBCtwo(\bfx, \bfu) = [\grad_\bfx \Lie_f h(\bfx)]^\top F(\bfx)\ctrlaff + K_\alpha \begin{bmatrix} h(\bfx) & \Lie_f h(\bfx) \end{bmatrix}^\top \]
  • \( \CBCtwo(\bfx, \bfu) \) is a quadratic form of GP.
    \( \E[\CBCtwo](\bfx, \bfu) \) is still affine in \( \bfu \).
    \( \Var[\CBCtwo](\bfx, \bfx'; \bfu) \) is still quadratic in \( \bfu \).
  • For \( r \ge 3 \), \(\CBCr\) statistics can be estimated by Monte Carlo methods.
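A Monte Carlo estimate of the mean and variance needs only a way to draw samples. The sketch below uses a toy stand-in sampler, the product of two independent Gaussian scalars, purely because its moments are known in closed form; it is not the \( \CBCr \) of the talk, and both function names are hypothetical.

```python
import numpy as np

def mc_mean_var(sample_cbc, num_samples, rng):
    """Estimate mean and variance from draws produced by sample_cbc(rng, size)."""
    draws = sample_cbc(rng, num_samples)
    return draws.mean(), draws.var(ddof=1)

def toy_sampler(rng, size):
    # Stand-in for a CBC^(r) sampler: product of two independent Gaussians.
    a = rng.normal(1.0, 0.5, size)
    b = rng.normal(2.0, 0.3, size)
    return a * b

rng = np.random.default_rng(0)
mc_mean, mc_var = mc_mean_var(toy_sampler, 200_000, rng)

# Closed-form moments of the product of independent Gaussians, for comparison.
exact_mean = 1.0 * 2.0
exact_var = 1.0**2 * 0.3**2 + 2.0**2 * 0.5**2 + 0.5**2 * 0.3**2
```

In an actual controller the sampler would draw dynamics from the learned posterior and evaluate \( \CBCr \) on each draw; the estimator itself stays the same.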

Safe controller using ECBF

  • \begin{align} \min_{\bfu_k \in \mathcal{U}}& \|\bfu_k - \pi_\epsilon(\bfx_k) \|_Q \\ \qquad\text{s.t.}&~~ \bbP\bigl( \CBCr(\bfx_k, \bfu_k) > \zeta \mid \bfx_k,\bfu_k \bigr) \ge \tilde{p}_k \end{align}
  • Using Cantelli's (Chebyshev's one-sided) inequality
  • Safe controller (a QCQP) \begin{align} \min_{\bfu_k \in \mathcal{U}}& \|\bfu_k - \pi_\epsilon(\bfx_k) \|_Q \\ \qquad\text{s.t.}\qquad &(\E[\mbox{CBC}_k^{(r)}]-\zeta)^2 \ge \frac{\tilde{p}_k}{1-\tilde{p}_k}\Var[\mbox{CBC}_k^{(r)}] \\ &\E[\mbox{CBC}_k^{(r)}]-\zeta \ge 0 \end{align}
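Cantelli's inequality gives a distribution-free lower bound on the safety probability, which leads to the constraint above. The sketch below checks that relationship for assumed numbers; the function names are hypothetical.

```python
def cantelli_lower_bound(mean, var, zeta):
    """Lower bound on P(CBC > zeta) from Cantelli's inequality (needs mean > zeta)."""
    margin = mean - zeta
    if margin <= 0:
        return 0.0          # the bound is uninformative without a positive margin
    return margin**2 / (var + margin**2)

def satisfies_cantelli_constraint(mean, var, zeta, p):
    """Deterministic pair of constraints used in the ECBF safe controller."""
    margin = mean - zeta
    return margin >= 0 and margin**2 >= p / (1.0 - p) * var

bound = cantelli_lower_bound(mean=3.0, var=1.0, zeta=0.0)
ok_080 = satisfies_cantelli_constraint(3.0, 1.0, 0.0, p=0.80)
ok_095 = satisfies_cantelli_constraint(3.0, 1.0, 0.0, p=0.95)
```

With a margin of three standard deviations the bound is 0.9, so a required level of 0.80 is certified and 0.95 is not. A Gaussian assumption would certify more, which is the price of not relying on the distribution of \( \CBCr \).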

Learning Experiments

  • \begin{align} \begin{bmatrix} \dot{\theta} \\ \dot{\omega} \end{bmatrix} = \underbrace{\begin{bmatrix} \omega \\ -\frac{g}{l} \sin(\theta) \end{bmatrix}}_{f(\bfx)} + \underbrace{\begin{bmatrix} 0 \\ \frac{1}{ml} \end{bmatrix}}_{g(\bfx)} u \end{align}
  • \begin{align} h\left(\begin{bmatrix} \theta \\ \omega \end{bmatrix} \right) = \cos(\Delta_{col}) - \cos(\theta - \theta_c) \end{align}

Safe controller using ECBF Experiments

  • \begin{align} \begin{bmatrix} \dot{\theta} \\ \dot{\omega} \end{bmatrix} = \underbrace{\begin{bmatrix} \omega \\ -\frac{g}{l} \sin(\theta) \end{bmatrix}}_{f(\bfx)} + \underbrace{\begin{bmatrix} 0 \\ \frac{1}{ml} \end{bmatrix}}_{g(\bfx)} u \end{align}
  • \begin{align} h\left(\begin{bmatrix} \theta \\ \omega \end{bmatrix} \right) = \cos(\Delta_{col}) - \cos(\theta - \theta_c) \end{align}


Take away

  • Safety guarantees in stochastic control-affine systems were formulated as quadratic constraints on the control signal using Exponential Control Barrier Functions.

Ongoing work

  • More experiments (closer to the Motivation).
  • Entropy objective to pick optimal actions for reducing uncertainty.
  • Application of Hanson-Wright-like inequalities for tighter bounds on \( \CBCr \)

Thank you. Questions?

Paper URL: arxiv.org/abs/1912.10116
