Kan Extensions in Data Science and Machine Learning

03/17/2022
by Dan Shiebler, et al.

A common problem in data science is "use this function defined over this small set to generate predictions over that larger set." Extrapolation, interpolation, statistical inference and forecasting all reduce to this problem. The Kan extension is a powerful tool in category theory that generalizes this notion. In this work we explore several applications of Kan extensions to data science. We begin by deriving a simple classification algorithm as a Kan extension and experimenting with this algorithm on real data. Next, we use the Kan extension to derive a procedure for learning clustering algorithms from labels and explore the performance of this procedure on real data. We then investigate how Kan extensions can be used to learn a general mapping from datasets of labeled examples to functions and to approximate a complex function with a simpler one.
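To make the "small set to larger set" idea concrete, here is a minimal sketch of Kan extensions in the simplest setting, where every category involved is a partially ordered set. It assumes points in R^n ordered componentwise and Boolean labels ordered False < True; these choices, along with the names `leq`, `left_kan`, `right_kan` and the toy data, are illustrative assumptions and are not taken from the paper. In this setting the left Kan extension of a monotone labeling along the inclusion of the labeled points is the smallest monotone extension (a supremum over labeled points below the query), and the right Kan extension is the largest (an infimum over labeled points above the query).

```python
from typing import Callable, Dict, Tuple

Point = Tuple[float, ...]


def leq(a: Point, b: Point) -> bool:
    """Componentwise (product) order on R^n: a <= b in every coordinate."""
    return all(x <= y for x, y in zip(a, b))


def left_kan(labeled: Dict[Point, bool]) -> Callable[[Point], bool]:
    """Left Kan extension along the inclusion of the labeled points.

    Pointwise formula in a poset: the supremum of the labels of all
    labeled points below x. For Booleans the supremum is `any`, and the
    supremum of an empty set is False.
    """
    def extend(x: Point) -> bool:
        return any(label for p, label in labeled.items() if leq(p, x))
    return extend


def right_kan(labeled: Dict[Point, bool]) -> Callable[[Point], bool]:
    """Right Kan extension along the inclusion of the labeled points.

    Pointwise formula in a poset: the infimum of the labels of all
    labeled points above x. For Booleans the infimum is `all`, and the
    infimum of an empty set is True.
    """
    def extend(x: Point) -> bool:
        return all(label for p, label in labeled.items() if leq(x, p))
    return extend


# Hypothetical 1-D training set; the labeling is monotone in the order,
# which is what makes both constructions genuine extensions of it.
labeled = {(1.0,): False, (2.0,): False, (4.0,): True, (5.0,): True}
lan = left_kan(labeled)
ran = right_kan(labeled)
```

On this toy data both extensions reproduce the given labels on the labeled points, and they disagree only in the gap between the last False point and the first True point: `lan((3.0,))` is False while `ran((3.0,))` is True. The region where the two extensions differ is exactly where the labels do not determine the prediction.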


research
12/02/2021

The Art and Practice of Data Science Pipelines: A Comprehensive Study of Data Science Pipelines In Theory, In-The-Small, and In-The-Large

An increasingly large number of software systems today are including data ...
research
04/23/2020

Human-Machine Collaboration for Democratizing Data Science

Everybody wants to analyse their data, but only few possess the data scie...
research
05/23/2022

Statistical inference as Green's functions

Statistical inference from data is a foundational task in science. Recentl...
research
09/18/2019

Distance Geometry and Data Science

Data are often represented as graphs. Many common tasks in data science ...
research
05/13/2020

Tropical Data Science

Phylogenomics is a new field which applies tools in phylogenetics to ...
research
01/03/2023

Introducing Variational Inference in Statistics and Data Science Curriculum

Probabilistic models such as logistic regression, Bayesian classificatio...
research
09/11/2020

Machine Learning and Data Science approach towards trend and predictors analysis of CDC Mortality Data for the USA

The research on mortality is an active area of research for any country ...
