Workshop on Lie groups and symmetry
for Machine Learning and Deep Learning

Gaussian Processes

on Lie Groups

and their Homogeneous Spaces

Viacheslav (Slava) Borovitskiy

Iskander Azangulov

Currently: PhD @ Oxford

Andrei Smolensky

Currently: Asst. Prof @ Pafos Uni

Alexander Terenin

Currently: Asst. Prof @ Cornell

Gaussian Processes

Uncertainty-enabled models: input ⟶ prediction + uncertainty.

Gaussian processes (GPs) — gold standard.
— random functions with jointly Gaussian marginals.

Bayesian learning: prior GP + data = posterior GP

Prior GPs are determined by kernels (covariance functions)
— have to be positive semi-definite (PSD),
— not all kernels define "good" GPs.

This talk: defining priors / kernels.

Non-Euclidean Domains

Manifolds
e.g. in physics (robotics)

Graphs
e.g. for road networks

Non-Euclidean Domains

This talk:

Lie groups
e.g., $\mathbb{T}^n$, $\mathrm{SO}(n)$, $\mathrm{SU}(n)$

Homogeneous spaces
e.g., Spheres, Grassmann, Stiefel

Motivating Application: Bayesian Optimization in Robotics

Joint postures: $\mathbb{T}^d$

Rotations: $\mathbb{S}_3$, $\operatorname{SO}(3)$

Which GPs?

General-purpose GPs

On $\mathbb{R}^d$: Matérn Gaussian Processes

$$ \htmlData{class=fragment fade-out,fragment-index=6}{ \footnotesize \mathclap{ k_{\nu, \kappa, \sigma^2}(x,x') = \sigma^2 \frac{2^{1-\nu}}{\Gamma(\nu)} \del{\sqrt{2\nu} \frac{\norm{x-x'}}{\kappa}}^\nu K_\nu \del{\sqrt{2\nu} \frac{\norm{x-x'}}{\kappa}} } } \htmlData{class=fragment d-print-none,fragment-index=6}{ \footnotesize \mathclap{ k_{\infty, \kappa, \sigma^2}(x,x') = \sigma^2 \exp\del{-\frac{\norm{x-x'}^2}{2\kappa^2}} } } $$ $\sigma^2$: variance $\kappa$: length scale $\nu$: smoothness
$\nu\to\infty$: RBF kernel (Gaussian, Heat, Diffusion)

$\nu = 1/2$

$\nu = 3/2$

$\nu = 5/2$

$\nu = \infty$

Generalize the RBF Kernel?

$$ k_{\infty, \kappa, \sigma^2}(x,x') = \sigma^2\exp\del{-\frac{|x - x'|^2}{2\kappa^2}} $$

$$ k_{\infty, \kappa, \sigma^2}^{(d)}(x,x') = \sigma^2\exp\del{-\frac{d(x,x')^2}{2\kappa^2}} $$

Manifolds: not PSD for some $\kappa$ unless the manifold is isometric to $\mathbb{R}^d$.

Feragen et al. (CVPR 2015).

Symmetric spaces (incl. compact Lie groups): not PSD for all $\kappa$ unless isometric to $\mathbb{R}^d$.

Da Costa et al. (SIAM JMDS 2025).

Long story short...

Solution for Compact Riemannian Manifolds

The solution is a Gaussian process with kernel $$ \htmlData{fragment-index=2,class=fragment}{ k_{\nu, \kappa, \sigma^2}(x,x') = \frac{\sigma^2}{C_{\nu, \kappa}} \sum_{n=0}^\infty \Phi_{\nu, \kappa}(\lambda_n) f_n(x) f_n(x') } $$

$\lambda_n, f_n$: eigenvalues and eigenfunctions of the Laplace–Beltrami operator

$$ \htmlData{fragment-index=4,class=fragment}{ \Phi_{\nu, \kappa}(\lambda) = \begin{cases} \htmlData{fragment-index=5,class=fragment}{ \del{\frac{2\nu}{\kappa^2} - \lambda}^{-\nu-\frac{d}{2}} } & \htmlData{fragment-index=5,class=fragment}{ \nu < \infty \text{ --- Matérn} } \\ \htmlData{fragment-index=6,class=fragment}{ e^{-\frac{\kappa^2}{2} \lambda} } & \htmlData{fragment-index=6,class=fragment}{ \nu = \infty \text{ --- Heat (RBF)} } \end{cases} } $$

How to get $\lambda_n, f_n$?

Answer I: Discretize the Problem

Mesh the manifold, consider the discretized Laplace–Beltrami (a matrix).

Answer II: Use Algebraic Structure

Compact Lie groups and homogeneous spaces:

$\lambda_n, f_n$ are connected to representation theory of the group of symmetries.

A Homogeneous Space Example: Sphere $\mathbb{S}_n$

$$ \begin{aligned} \htmlData{fragment-index=1,class=fragment}{k(x,y)} &\htmlData{fragment-index=2,class=fragment}{ = \frac{\sigma^2}{C_{\nu, \kappa}} \sum_{n=0}^\infty \Phi_{\nu, \kappa}(\lambda_n) f_n(x) \mathrlap{f_n(y)} \htmlData{fragment-index=3,class=fragment}{ \obr{\phantom{f_n(y)}}{\hspace{-0.5cm}\text{spherical harmonics}\hspace{-0.5cm}} } } \\ &\htmlData{fragment-index=4,class=fragment}{= \frac{\sigma^2}{C_{\nu, \kappa}} \sum_{n=0}^\infty \Phi_{\nu, \kappa}(\lambda_n) \mathrlap{\del{\sum_{k=1}^{d_n} f_{n, k}(x) f_{n, k}(y)}} \htmlData{fragment-index=5,class=fragment}{ \ubr{\phantom{\del{\sum_{k=1}^{d_n} f_{n, k}(x) f_{n, k}(y)}}}{\hspace{-0.7cm} C_{n, d} \cdot \phi_{n, d}(x, y) \text{ --- zonal spherical harmonics}\hspace{-0.7cm}} } } \end{aligned} $$

The last equation

is much more efficient for computing $k$,
generalizes: zonal spherical harmonics → zonal ??????? ???????spherical functions,
unfortunately, is unfit for sampling. Or is it?

Examples: Samples from $\mathrm{GP}(0, k_{\infty})$ on Compact Homgeneous Spaces

Sphere $\mathbb{S}_2$

Torus $\mathbb{T}^2 = \bb{S}_1 \x \bb{S}_1$

Projective plane $\mathrm{RP}^2$

Thank you!

Slides:

viacheslav.borovitskiy@gmail.com https://vab.im @vabor112

Gaussian Processes

Gaussian process $f$ — random function with jointly Gaussian marginals.

Characterized by

a mean function $m(x) = \E(f(x))$,
a kernel (covariance) function $k(x, x') = \Cov(f(x), f(x'))$.

Notation: $f \~ \f{GP}(m, k)$.

The kernel $k$ must be positive (semi-)definite.

Gaussian Process Regression

Takes

data $(x_1, y_1), .., (x_n, y_n) \in X \x \R$,
and a prior Gaussian Process $\f{GP}(m, k)$

giving the posterior (conditional) Gaussian process $\f{GP}(\hat{m}, \hat{k})$.

The functions $\hat{m}$ and $\hat{k}$ may be explicitly expressed in terms of $m$ and $k$.

Motivating Example

Consider a Simple Dynamical System: a Pendulum

$$ \htmlData{fragment-index=0,class=fragment}{ x_0 } \qquad \htmlData{fragment-index=1,class=fragment}{ x_1 = x_0 + f(x_0)\Delta t } \qquad \htmlData{fragment-index=2,class=fragment}{ x_2 = x_1 + f(x_1)\Delta t } \qquad \htmlData{fragment-index=3,class=fragment}{ .. } $$

Consider a Simple Dynamical System: a Pendulum

$$ x_0 \qquad x_1 = x_0 + f(x_0)\Delta t \qquad x_2 = x_1 + f(x_1)\Delta t \qquad .. $$

Consider a Simple Dynamical System: a Pendulum

$$f: \R \x \R \to \R \x \R$$

assume $f$ unknown, model $f$ as a GP.

$$ x_0 \qquad x_1 = x_0 + f(x_0)\Delta t \qquad x_2 = x_1 + f(x_1)\Delta t \qquad .. $$

Pendulum Dynamics: Phase Portrait

Phase portrait is periodic!

I.e. the state space is actually a cylinder!

Examples: The Values of $k_{1/2}(\htmlStyle{color:rgb(255, 19, 0)!important}{\bullet},\.)$ on Compact Riemannian Manifolds

Circle
(Lie group)

Sphere
(homogeneous space)

Dragon
(general manifold)

Sampling: Generalized Random Phase Fourier Features

Zonal spherical harmonics satisfy reproducing property:

$$ \begin{aligned} \htmlData{class=fragment}{ \phi_{n, d}(x, y) } & \htmlData{class=fragment}{ = C_{n, d} \int_{\mathbb{S^n}} \phi_{n, d}(x, u) \phi_{n, d}(y, u) d u } \\ & \htmlData{class=fragment}{ \approx \frac{C_{n, d}}{L} \sum_{l=1}^L \phi_{n, d}(x, u_l) \phi_{n, d}(y, u_l), } && \htmlData{class=fragment}{ u_l \stackrel{\text{i.i.d.}}{\sim} \mathrm{U}(\mathbb{S}_n). } \end{aligned} $$

Hence $\sqrt{C_{n, d}/L} \cdot \phi_{n, d}(x, \v{u})$ forms an approximate feature transform.

This enables efficient sampling without knowing $f_n$. And this generalizes!

Non-compact symmetric spaces

Random Fourier Features (RFF)

Take stationary $k$. Assume for simplicity $k(x, x) = 1$. Then

$$ \htmlData{data-id=rffformula,class=fragment}{ \begin{aligned} k(x, x') &= \int_{\R^d} S(\lambda) e^{2 \pi i \innerprod{x - x'}{\lambda}} \d \lambda \\ & \htmlData{class=fragment}{ \approx \frac{1}{L} \sum_{l=1}^L e^{2 \pi i \innerprod{x - x'}{\lambda_l}} \qquad \lambda_l \sim S(\lambda) } \end{aligned} } $$

$k$ is RBF $\implies$ $S(\lambda)$ is Gaussian. $k$ is Matérn $\implies$ $S(\lambda)$ is $t$ distributed.

Random Fourier Features (RFF) for Symmetric Spaces

$$ \htmlData{data-id=rffformula}{ \begin{aligned} k(x, x') &= \htmlData{fragment-index=1,class=fragment fade-out disappearing-fragment}{ \int_{\R^d} } \htmlData{fragment-index=1,class=fragment fade-in appearing-fragment}{ \int_{\R^{\color{blue}r}} } S(\lambda) \htmlData{fragment-index=2,class=fragment fade-out disappearing-fragment}{ e^{2 \pi i \innerprod{x - x'}{\lambda}} } \htmlData{fragment-index=2,class=fragment fade-in appearing-fragment}{ {\color{blue}\pi^{(\lambda)}(x, x')} } \htmlData{fragment-index=3,class=fragment fade-in appearing-fragment}{ {\color{blue}c(\lambda)^{-2}} } \d \lambda \\ &\approx \frac{1}{L} \sum_{l=1}^L \htmlData{fragment-index=4,class=fragment fade-out disappearing-fragment}{ e^{2 \pi i \innerprod{x - x'}{\lambda_l}} } \htmlData{fragment-index=4,class=fragment fade-in appearing-fragment}{ {\color{blue}\pi^{(\lambda_l)}(x, x')} } \qquad \lambda_l \sim \htmlData{fragment-index=5,class=fragment fade-in appearing-fragment}{ {\color{blue}c(\lambda)^{-2}} } S(\lambda) \end{aligned} } $$

$k$ is RBF $\implies$ $S(\lambda)$ is Gaussian. $k$ is Matérn $\implies$ $S(\lambda)$ is $t$ distributed.

• $\pi^{(\lambda)}$ is an explicit integral. • $c(\lambda)$ has closed form.
• $c(\lambda)^{-2} S(\lambda)$ is a non-standard, potentially unnormalized, density.

Examples: The Values of $k_{\infty}(\htmlStyle{color:rgb(0, 0, 0)!important}{\bullet},\.)$ on Non-compact Manifolds

Hyperbolic space $\bb{H}_2$

Space of positive definite matrices $\f{SPD}(2)$

Random Fourier Features for Sampling

$$ \htmlData{class=fragment}{ k(x, x') \approx \frac{1}{L} \sum_{l=1}^L e^{2 \pi i \innerprod{x - x'}{\lambda_l}} \qquad \lambda_l \sim S(\lambda) } $$

Since $e^{2 \pi i \innerprod{x - x'}{\lambda_l}} = e^{2 \pi i \innerprod{x}{\lambda_l}} \overline{e^{2 \pi i \innerprod{x'}{\lambda_l}}}$, the above is an inner product.

$$ \htmlData{class=fragment}{ f(x) \approx \frac{1}{\sqrt{L}} \sum_{l=1}^L w_l e^{2 \pi i \innerprod{x}{\lambda_l}} \qquad w_l \sim \mathrm{N}(0, 1) \qquad \lambda_l \sim S(\lambda) } $$

Random Fourier Features for Sampling: symmetric spaces

$$ k(x, x') \approx \frac{1}{L} \sum_{l=1}^L \htmlData{fragment-index=1,class=fragment fade-out disappearing-fragment}{ e^{2 \pi i \innerprod{x - x'}{\lambda_l}} } \htmlData{fragment-index=1,class=fragment fade-in appearing-fragment}{ {\color{blue}\pi^{(\lambda_l)}(x, x')} } \qquad \htmlData{fragment-index=2,class=fragment fade-in appearing-fragment}{ {\color{blue}c(\lambda)^{-2}} } \lambda_l \sim S(\lambda) $$

Since $ \htmlData{fragment-index=3,class=fragment fade-out disappearing-fragment}{ e^{2 \pi i \innerprod{x - x'}{\lambda_l}} } \htmlData{fragment-index=3,class=fragment fade-in appearing-fragment}{ {\color{blue}\pi^{(\lambda_l)}(x, x')} } \!\,= \htmlData{fragment-index=4,class=fragment fade-out disappearing-fragment}{ e^{2 \pi i \innerprod{x}{\lambda_l}} \overline{e^{2 \pi i \innerprod{x'}{\lambda_l}}} } \htmlData{fragment-index=4,class=fragment fade-in appearing-fragment}{ {\color{blue} \pi^{(\lambda_l)}(x, ?) \overline{\pi^{(\lambda_l)}(x', ?)} } } $, the above is an inner product. $ \vphantom{ e^{2 \pi i \innerprod{x - x'}{\lambda_l}} {\color{blue}\pi^{(\lambda_l)}(x, x')} \!\,= e^{2 \pi i \innerprod{x}{\lambda_l}} \overline{e^{2 \pi i \innerprod{x'}{\lambda_l}}} {\color{blue} \pi^{(\lambda_l)}(x, ?) \overline{\pi^{(\lambda_l)}(x', ?)} } } \sout{ {\color{blue}\pi^{(\lambda_l)}(x, x')} = {\color{blue} \pi^{(\lambda_l)}(x, ?) \overline{\pi^{(\lambda_l)}(x', ?)} } } $, $ \vphantom{ e^{2 \pi i \innerprod{x - x'}{\lambda_l}} {\color{blue}\pi^{(\lambda_l)}(x, x')} \!\,= e^{2 \pi i \innerprod{x}{\lambda_l}} \overline{e^{2 \pi i \innerprod{x'}{\lambda_l}}} {\color{blue} \pi^{(\lambda_l)}(x, ?) \overline{\pi^{(\lambda_l)}(x', ?)} } } {\color{blue}\pi^{(\lambda_l)}(x, x')} = \E_{h \sim \mu_H} e^{\innerprod{i \lambda + \rho}{\,a(h, x)}} \overline{ e^{\innerprod{i \lambda + \rho}{\,a(h, x')}} }, $ where

$$ f(x) \approx \frac{1}{\sqrt{L}} \sum_{l=1}^L w_l e^{2 \pi i \innerprod{x}{\lambda_l}} \qquad w_l \sim \mathrm{N}(0, 1) \qquad \lambda_l \sim S(\lambda) $$

• the vector $\rho$ and the function $a(\cdot, \cdot)$ are known, • $\mu_H$ is samplabale.
• Monte Carlo approximation of the expectation $\E_{h \sim \mu_H}$ is an inner product.

Examples: Samples from $\mathrm{GP}(0, k_{\infty})$ on Non-compact Homgeneous Spaces

Hyperbolic space $\bb{H}_2$

Space of positive definite matrices $\f{SPD}(2)$

Workshop on Lie groups and symmetry for Machine Learning and Deep Learning

Gaussian Processes

on Lie Groups

and their Homogeneous Spaces

Viacheslav (Slava) Borovitskiy

Iskander Azangulov

Currently: PhD @ Oxford

Andrei Smolensky

Currently: Asst. Prof @ Pafos Uni

Alexander Terenin

Currently: Asst. Prof @ Cornell

Gaussian Processes

Non-Euclidean Domains

Manifolds e.g. in physics (robotics)

Graphs e.g. for road networks

Non-Euclidean Domains

Lie groups e.g., $\mathbb{T}^n$, $\mathrm{SO}(n)$, $\mathrm{SU}(n)$

Homogeneous spaces e.g., Spheres, Grassmann, Stiefel

Motivating Application: Bayesian Optimization in Robotics

Joint postures: $\mathbb{T}^d$

Rotations: $\mathbb{S}_3$, $\operatorname{SO}(3)$

Which GPs?

General-purpose GPs

On $\mathbb{R}^d$: Matérn Gaussian Processes

Generalize the RBF Kernel?

Long story short...

Solution for Compact Riemannian Manifolds

$\lambda_n, f_n$: eigenvalues and eigenfunctions of the Laplace–Beltrami operator

How to get $\lambda_n, f_n$?

Answer I: Discretize the Problem

Answer II: Use Algebraic Structure

A Homogeneous Space Example: Sphere $\mathbb{S}_n$

Examples: Samples from $\mathrm{GP}(0, k_{\infty})$ on Compact Homgeneous Spaces

Thank you!

Thank you!

Gaussian Processes

Gaussian Process Regression

Motivating Example

Consider a Simple Dynamical System: a Pendulum

Consider a Simple Dynamical System: a Pendulum

Consider a Simple Dynamical System: a Pendulum

Pendulum Dynamics: Phase Portrait

Pendulum Dynamics: Phase Portrait

Pendulum Dynamics: Phase Portrait

Pendulum Dynamics: Phase Portrait

Examples: The Values of $k_{1/2}(\htmlStyle{color:rgb(255, 19, 0)!important}{\bullet},\.)$ on Compact Riemannian Manifolds

Sampling: Generalized Random Phase Fourier Features

Non-compact symmetric spaces

Random Fourier Features (RFF)

Random Fourier Features (RFF) for Symmetric Spaces

Examples: The Values of $k_{\infty}(\htmlStyle{color:rgb(0, 0, 0)!important}{\bullet},\.)$ on Non-compact Manifolds

Random Fourier Features for Sampling

Random Fourier Features for Sampling: symmetric spaces

Examples: Samples from $\mathrm{GP}(0, k_{\infty})$ on Non-compact Homgeneous Spaces

Workshop on Lie groups and symmetry
for Machine Learning and Deep Learning

Manifolds
e.g. in physics (robotics)

Graphs
e.g. for road networks

Lie groups
e.g., $\mathbb{T}^n$, $\mathrm{SO}(n)$, $\mathrm{SU}(n)$

Homogeneous spaces
e.g., Spheres, Grassmann, Stiefel