Geometry Seminar, Department of Mathematics, ETH Zürich
Geometry-aware Gaussian Processes for Machine Learning
Viacheslav Borovitskiy (Slava)

Definition. A Gaussian process is a random function f on a set X such that for any x1, …, xn ∈ X, the vector (f(x1), …, f(xn)) is multivariate Gaussian.
The distribution of a Gaussian process is characterized by its mean function m(x) = E f(x) and its covariance kernel k(x, x′) = Cov(f(x), f(x′)).
Notation: f∼GP(m,k).
The kernel k must be positive (semi-)definite, i.e. for all x1, …, xn ∈ X
the matrix Kxx := {k(xi, xj)}, 1 ≤ i, j ≤ n, must be positive (semi-)definite.
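This requirement can be checked numerically for any finite set of points; a minimal NumPy sketch (the squared-exponential kernel and the points below are illustrative choices):

```python
import numpy as np

def gram_matrix(kernel, xs):
    """Kernel matrix Kxx = {k(x_i, x_j)} for a list of points."""
    return np.array([[kernel(a, b) for b in xs] for a in xs])

# Illustrative kernel: squared exponential on the real line.
k = lambda a, b: np.exp(-0.5 * (a - b) ** 2)

xs = [0.0, 0.7, 1.3, 2.9]
K = gram_matrix(k, xs)

# Positive semi-definite <=> all eigenvalues >= 0 (up to rounding error).
eigvals = np.linalg.eigvalsh(K)
print(eigvals.min() >= -1e-12)  # True
```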
Conditioning the prior GP(m, k) on observations (x1, y1), …, (xn, yn)
gives the posterior (conditional) Gaussian process GP(m̂, k̂).
The functions m^ and k^ may be explicitly expressed in terms of m and k.
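For the standard case of a zero prior mean and observations with Gaussian noise, these expressions are m̂(·) = K·X (KXX + σn²I)⁻¹ y and k̂(·,·′) = k(·,·′) − K·X (KXX + σn²I)⁻¹ KX·′. A minimal NumPy sketch under those assumptions (the kernel and data are illustrative):

```python
import numpy as np

def sq_exp(a, b, kappa=1.0):
    """Squared-exponential kernel between two arrays of 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / kappa**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean m_hat and covariance k_hat of a zero-mean GP."""
    K = sq_exp(x_train, x_train) + noise * np.eye(len(x_train))
    K_star = sq_exp(x_test, x_train)
    alpha = np.linalg.solve(K, y_train)
    m_hat = K_star @ alpha
    k_hat = sq_exp(x_test, x_test) - K_star @ np.linalg.solve(K, K_star.T)
    return m_hat, k_hat

x = np.array([0.0, 1.0, 2.0])
y = np.sin(x)
m_hat, k_hat = gp_posterior(x, y, np.array([0.5, 1.5]))
```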
Goal: minimize an unknown function ϕ in as few evaluations as possible.
kν,κ,σ²(x,x′) = σ² · 2^(1−ν)/Γ(ν) · (√(2ν) ∥x−x′∥ / κ)^ν · Kν(√(2ν) ∥x−x′∥ / κ)
k∞,κ,σ²(x,x′) = σ² exp(−∥x−x′∥² / (2κ²))
σ2: variance
κ: length scale
ν: smoothness
ν→∞: Gaussian kernel (Heat, Diffusion, RBF)
[Figure: kernel profiles for ν = 1/2, 3/2, 5/2, ∞]
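The Matérn family can be implemented directly from its Bessel-function form; a sketch using SciPy, treating the kernel as a function of the distance r = ∥x − x′∥:

```python
import numpy as np
from scipy.special import kv, gamma

def matern(r, nu=1.5, kappa=1.0, sigma2=1.0):
    """Matern kernel as a function of the distance r = ||x - x'||."""
    r = np.asarray(r, dtype=float)
    scaled = np.sqrt(2 * nu) * r / kappa
    out = np.full_like(r, sigma2)          # value sigma^2 at r = 0
    nz = scaled > 0
    out[nz] = (sigma2 * 2 ** (1 - nu) / gamma(nu)
               * scaled[nz] ** nu * kv(nu, scaled[nz]))
    return out

def sq_exp(r, kappa=1.0, sigma2=1.0):
    """nu -> infinity limit: the Gaussian (RBF) kernel."""
    return sigma2 * np.exp(-np.asarray(r, dtype=float) ** 2 / (2 * kappa**2))
```

For ν = 1/2 this reduces to the exponential kernel σ² exp(−r/κ), and for ν = 3/2 to σ²(1 + √3 r/κ) exp(−√3 r/κ), which gives a quick sanity check of the implementation.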
k∞,κ,σ²^(dg)(x,x′) = σ² exp(−dg(x,x′)² / (2κ²))
Theorem. (Feragen et al.) Let M be a complete Riemannian manifold without boundary. If k∞,κ,σ2(dg) is positive semi-definite for all κ, then M is isometric to a Euclidean space.
For Matérn kernels: apparently an open problem.
Matérn SPDE: (2ν/κ² − Δ)^(ν/2 + d/4) f = W.   Δ: Laplacian.   W: Gaussian white noise.
The solution is a Gaussian process with kernel kν,κ,σ²(x,x′) = (σ²/Cν) ∑_{n=0}^∞ (2ν/κ² − λn)^(−ν−d/2) fn(x) fn(x′), where (λn, fn) are the eigenpairs of the Laplacian Δ.
Mesh the manifold, consider the discretized Laplace–Beltrami (a matrix).
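A sketch of this recipe, assuming the discretized Laplace–Beltrami operator has already been assembled as a symmetric matrix L (the path-graph Laplacian below is only a stand-in for a real mesh Laplacian; since L discretizes −Δ, its eigenvalues λn are nonnegative and the exponent base becomes 2ν/κ² + λn; normalizing by the average variance is one common convention):

```python
import numpy as np

def matern_kernel_from_laplacian(L, nu=1.5, kappa=1.0, sigma2=1.0, d=1):
    """Truncated series k = (sigma^2 / C_nu) * sum_n
    (2 nu / kappa^2 + lambda_n)^(-nu - d/2) f_n(x) f_n(x'),
    with (lambda_n, f_n) eigenpairs of the discretized Laplacian L."""
    lam, F = np.linalg.eigh(L)                  # lam >= 0 for a PSD Laplacian
    weights = (2 * nu / kappa**2 + lam) ** (-nu - d / 2)
    K = (F * weights) @ F.T                     # F diag(weights) F^T
    C_nu = np.mean(np.diag(K))                  # average variance -> sigma2
    return sigma2 * K / C_nu

# Illustrative stand-in for a mesh Laplacian: path graph on 5 nodes.
n = 5
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1
K = matern_kernel_from_laplacian(L)
```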
Manifold: Lie group. Metric: Killing form. Laplacian ≡ Casimir.
Eigenfunctions ≡ matrix coefficients of unitary irreducible representations.
k(x,y) = (σ²/Cν) ∑_{n=0}^∞ (2ν/κ² − λn)^(−ν−d/2) fn(x) fn(y) = (σ²/Cν) ∑_π (2ν/κ² − λπ)^(−ν−d/2) dπ χπ(y⁻¹x).
Compute χπ: Weyl character formula. Compute λπ: Freudenthal’s formula.
Manifold: homogeneous space G/H. Metric: inherited from the Lie group G.
Eigenfunctions ≡ spherical functions (for G/H = Sᵈ: spherical harmonics).
Characters χπ are replaced by zonal spherical functions ϕπ. For G/H = Sᵈ: zonal spherical harmonics (certain Gegenbauer polynomials of the distance).
Eigenvalues — same as for G.
k(x,y) = (σ²/Cν) ∑_π (2ν/κ² − λπ)^(−ν−d/2) dπ ϕπ(y⁻¹x).
The solution is a Gaussian process with kernel kν,κ,σ²(i,j) = (σ²/Cν) ∑_{n=0}^{|V|−1} (2ν/κ² + λn)^(−ν) fn(i) fn(j)
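On a graph the sum is finite, so the kernel can be computed exactly from one eigendecomposition of the Laplacian L = D − A. A minimal NumPy sketch (the 4-cycle graph and the average-variance normalization are illustrative choices):

```python
import numpy as np

def graph_matern_kernel(A, nu=2.0, kappa=1.0, sigma2=1.0):
    """k(i,j) = (sigma^2 / C_nu) * sum_n (2 nu / kappa^2 + lambda_n)^(-nu)
    f_n(i) f_n(j), with (lambda_n, f_n) eigenpairs of L = D - A."""
    L = np.diag(A.sum(axis=1)) - A              # combinatorial graph Laplacian
    lam, F = np.linalg.eigh(L)
    weights = (2 * nu / kappa**2 + lam) ** (-nu)
    K = (F * weights) @ F.T                     # F diag(weights) F^T
    C_nu = np.mean(np.diag(K))                  # average variance -> sigma2
    return sigma2 * K / C_nu

# Illustrative graph: a cycle on 4 vertices.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
K = graph_matern_kernel(A)
```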
x0;  x1 = x0 + f(x0)Δt;  x2 = x1 + f(x1)Δt;  …
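This rollout is the Euler discretization of learned dynamics f, as used in model-based policy search; a minimal sketch, with an illustrative linear system standing in for a learned GP dynamics model:

```python
import numpy as np

def rollout(f, x0, dt=0.1, steps=5):
    """Euler discretization: x_{t+1} = x_t + f(x_t) * dt."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(xs[-1] + f(xs[-1]) * dt)
    return np.stack(xs)

# Illustrative dynamics: linear decay dx/dt = -x.
traj = rollout(lambda x: -x, x0=[1.0], dt=0.1, steps=3)
# traj: [[1.0], [0.9], [0.81], [0.729]]
```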
V. Borovitskiy, A. Terenin, P. Mostowsky, M. P. Deisenroth. Matérn Gaussian Processes on Riemannian Manifolds.
In Neural Information Processing Systems (NeurIPS) 2020.
V. Borovitskiy, I. Azangulov, A. Terenin, P. Mostowsky, M. P. Deisenroth. Matérn Gaussian Processes on Graphs.
In International Conference on Artificial Intelligence and Statistics (AISTATS) 2021.
N. Jaquier, V. Borovitskiy, A. Smolensky, A. Terenin, T. Asfour and L. Rozo. Geometry-aware Bayesian Optimization in Robotics using Riemannian Matérn Kernels. In Conference on Robot Learning (CoRL) 2021.
I. Azangulov, A. Smolensky, A. Terenin, V. Borovitskiy. Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces I: the Compact Case. Preprint arXiv:2208.14960, 2022.
P. Whittle. On Stationary Processes in the Plane. In Biometrika, 1954.
F. Lindgren, H. Rue, J. Lindström. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. In Journal of the Royal Statistical Society: Series B, 2011.
A. Feragen, F. Lauze, S. Hauberg. Geodesic exponential kernels: When curvature and linearity conflict. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015.
M. Deisenroth, C. E. Rasmussen. PILCO: A model-based and data-efficient approach to policy search. In International Conference on Machine Learning (ICML) 2011.
W. Neiswanger, K. A. Wang, S. Ermon. Bayesian algorithm execution: Estimating computable properties of black-box functions using mutual information. In International Conference on Machine Learning (ICML) 2021.
viacheslav.borovitskiy@gmail.com https://vab.im