
Kan Extensions in Data Science and Machine Learning

by Dan Shiebler, et al.

A common problem in data science is "use this function defined over this small set to generate predictions over that larger set." Extrapolation, interpolation, statistical inference and forecasting all reduce to this problem. The Kan extension is a powerful tool in category theory that generalizes this notion. In this work we explore several applications of Kan extensions to data science. We begin by deriving a simple classification algorithm as a Kan extension and experimenting with this algorithm on real data. Next, we use the Kan extension to derive a procedure for learning clustering algorithms from labels and explore the performance of this procedure on real data. We then investigate how Kan extensions can be used to learn a general mapping from datasets of labeled examples to functions and to approximate a complex function with a simpler one.
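The "small set to larger set" idea can be made concrete in the simplest setting, where the data live in a preorder and the labels are Booleans ordered false < true. The sketch below is not the authors' code: the function names `left_kan` and `right_kan`, the toy training set, and the choice of the usual order on real numbers are all illustrative assumptions. It relies only on the standard fact that, along an inclusion of preorders, the left Kan extension is a supremum over training points below the query and the right Kan extension is an infimum over training points above it.

```python
from typing import Callable, Iterable, Tuple, TypeVar

X = TypeVar("X")

def left_kan(train: Iterable[Tuple[X, bool]],
             leq: Callable[[X, X], bool]) -> Callable[[X], bool]:
    """Left Kan extension into {False < True} (assumed setting).

    Predicts True iff some training point at or below x is labeled True
    (a supremum over the points below x).
    """
    data = list(train)
    def predict(x: X) -> bool:
        return any(label for x0, label in data if leq(x0, x))
    return predict

def right_kan(train: Iterable[Tuple[X, bool]],
              leq: Callable[[X, X], bool]) -> Callable[[X], bool]:
    """Right Kan extension into {False < True} (assumed setting).

    Predicts False iff some training point at or above x is labeled False
    (an infimum over the points above x).
    """
    data = list(train)
    def predict(x: X) -> bool:
        return all(label for x0, label in data if leq(x, x0))
    return predict

# Hypothetical toy data: a monotone labeling of four real numbers.
train = [(1.0, False), (2.0, False), (4.0, True), (5.0, True)]
leq = lambda a, b: a <= b

lan = left_kan(train, leq)
ran = right_kan(train, leq)

for x in (0.5, 2.0, 3.0, 4.5):
    print(x, lan(x), ran(x))
```

With a monotone labeling such as this one, both extensions agree with the labels on the training points, and they differ only in the gap between the largest False example and the smallest True example (here, at 3.0). That gap is where the two extensions bracket the possible monotone classifiers.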

