Designing Machine Learning Toolboxes: Concepts, Principles and Patterns

01/13/2021
by   Franz J Király, et al.

Machine learning (ML) and AI toolboxes such as scikit-learn or Weka are workhorses of contemporary data scientific practice. Their central role is enabled by usable yet powerful designs that make it easy to specify, train and validate complex modeling pipelines. However, despite their universal success, the key design principles in their construction have never been fully analyzed. In this paper, we attempt to provide an overview of key patterns in the design of AI modeling toolboxes, taking inspiration, in equal parts, from the field of software engineering, implementation patterns found in contemporary toolboxes, and our own experience from developing ML toolboxes. In particular, we develop a conceptual model for the AI/ML domain, with a new type system, called scientific types, at its core. Scientific types capture the scientific meaning of common elements in ML workflows based on the set of operations that we usually perform with them (i.e., their interface) and their statistical properties. From our conceptual analysis, we derive a set of design principles and patterns. We illustrate that our analysis can not only explain the design of existing toolboxes, but also guide the development of new ones. We intend our contribution to be a state-of-the-art reference for future toolbox engineers, a summary of best practices, a collection of ML design patterns which may become useful for future research, and, potentially, the first steps towards a higher-level programming paradigm for constructing AI.


