Metric Hypertransformers are Universal Adapted Maps
We introduce a universal class of geometric deep learning models, called metric hypertransformers (MHTs), capable of approximating any adapted map F:𝒳^ℤ→𝒴^ℤ with approximable complexity, where 𝒳⊆ℝ^d and 𝒴 is any suitable metric space, and 𝒳^ℤ (resp. 𝒴^ℤ) capture all discrete-time paths on 𝒳 (resp. 𝒴). Suitable spaces 𝒴 include various (adapted) Wasserstein spaces, all Fréchet spaces admitting a Schauder basis, and a variety of Riemannian manifolds arising from information geometry. Even in the static case, where f:𝒳→𝒴 is a Hölder map, our results provide the first (quantitative) universal approximation theorem compatible with any such 𝒳 and 𝒴. Our universal approximation theorems are quantitative, and they depend on the regularity of F, the choice of activation function, the metric entropy and diameter of 𝒳, and on the regularity of the compact set of paths whereon the approximation is performed. Our guiding examples originate from mathematical finance. Notably, the MHT models introduced here are able to approximate a broad range of stochastic processes' kernels, including solutions to SDEs, many processes with arbitrarily long memory, and functions mapping sequential data to sequences of forward rate curves.
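As a reading aid (our paraphrase, not the authors' wording): "adapted" refers to causality of the sequence-to-sequence map F, meaning the output at each time may only use the input path up to that time. One standard way to state this is
$$F(x)_t = F(\tilde{x})_t \quad \text{whenever } x_s = \tilde{x}_s \text{ for all } s \le t,$$
for every $t \in \mathbb{Z}$ and $x, \tilde{x} \in \mathcal{X}^{\mathbb{Z}}$; in particular, $F(x)_t$ never depends on the future inputs $(x_s)_{s>t}$.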