
Kan Extensions in Data Science and Machine Learning

03/17/2022
by Dan Shiebler, et al.

A common problem in data science is "use this function defined over this small set to generate predictions over that larger set." Extrapolation, interpolation, statistical inference and forecasting all reduce to this problem. The Kan extension is a powerful tool in category theory that generalizes this notion. In this work we explore several applications of Kan extensions to data science. We begin by deriving a simple classification algorithm as a Kan extension and experimenting with this algorithm on real data. Next, we use the Kan extension to derive a procedure for learning clustering algorithms from labels and explore the performance of this procedure on real data. We then investigate how Kan extensions can be used to learn a general mapping from datasets of labeled examples to functions and to approximate a complex function with a simpler one.
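To make the "extend a function from a small set to a larger set" idea concrete, here is a minimal sketch of the simplest setting: Boolean labels on a partially ordered set. It assumes the coordinate-wise order on points and a labeling that is monotone on the labeled subset. The names `leq`, `left_kan` and `right_kan` are illustrative and not taken from the paper. In this setting the left Kan extension along the inclusion is the join (logical OR) of the labels at or below a point, and the right Kan extension is the meet (logical AND) of the labels at or above it. This is the standard order-theoretic special case, not necessarily the exact algorithm the authors evaluate.

```python
from typing import Dict, Tuple

Point = Tuple[float, ...]


def leq(a: Point, b: Point) -> bool:
    """Coordinate-wise (product) order; an assumed choice of preorder."""
    return all(x <= y for x, y in zip(a, b))


def left_kan(labeled: Dict[Point, bool], x: Point) -> bool:
    """Join (OR) of the labels of all labeled points at or below x."""
    return any(label for p, label in labeled.items() if leq(p, x))


def right_kan(labeled: Dict[Point, bool], x: Point) -> bool:
    """Meet (AND) of the labels of all labeled points at or above x."""
    return all(label for p, label in labeled.items() if leq(x, p))


# Hypothetical toy data: labels are monotone along the diagonal.
labeled = {
    (0.0, 0.0): False,
    (1.0, 1.0): False,
    (2.0, 2.0): True,
    (3.0, 3.0): True,
}

for query in [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]:
    print(query, left_kan(labeled, query), right_kan(labeled, query))
```

When the labeling is monotone, the left extension never predicts true where the right extension predicts false. The points where the two disagree, such as (1.5, 1.5) in the toy data, are exactly the points whose label is not determined by the labeled set and the order.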


04/23/2020

Human-Machine Collaboration for Democratizing Data Science

Everybody wants to analyse their data, but only a few possess the data scie...
05/23/2022

Statistical inference as Green's functions

Statistical inference from data is a foundational task in science. Recentl...
09/18/2019

Distance Geometry and Data Science

Data are often represented as graphs. Many common tasks in data science ...
05/13/2020

Tropical Data Science

Phylogenomics is a new field which applies tools in phylogenetics to ...
01/03/2023

Introducing Variational Inference in Statistics and Data Science Curriculum

Probabilistic models such as logistic regression, Bayesian classificatio...
09/11/2020

Machine Learning and Data Science approach towards trend and predictors analysis of CDC Mortality Data for the USA

The research on mortality is an active area of research for any country ...