Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs

11/02/2020
by Grigori Fursin, et al.

This article presents the motivation for and an overview of the Collective Knowledge framework (CK or cKnowledge). The CK concept is to decompose research projects into reusable components that encapsulate research artifacts and provide unified application programming interfaces (APIs), command-line interfaces (CLIs), meta descriptions and common automation actions for related artifacts. The CK framework is used to organize and manage research projects as a database of such components. Inspired by the USB "plug and play" approach for hardware, CK also helps to assemble portable workflows that can automatically plug in compatible components from different users and vendors (models, datasets, frameworks, compilers, tools). Such workflows can build and run algorithms on different platforms and environments in a unified way using the universal CK program pipeline with software detection plugins and the automatic installation of missing packages. This article presents a number of industrial projects in which the modular CK approach was successfully validated to automate benchmarking, auto-tuning and co-design of efficient software and hardware for machine learning (ML) and artificial intelligence (AI) in terms of speed, accuracy, energy, size and various costs. The CK framework also helped to automate the artifact evaluation process at several computer science conferences and made it easier to reproduce, compare and reuse research techniques from published papers, deploy them in production, and automatically adapt them to continuously changing datasets, models and systems. The long-term goal is to accelerate innovation by connecting researchers and practitioners so that they can share and reuse their knowledge, best practices, artifacts, workflows and experimental results in a common, portable and reproducible format at https://cKnowledge.io.
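
The component-and-workflow idea described above can be illustrated with a minimal, self-contained Python sketch. This is not the CK API: the names `Component`, `Repository`, `access`, `find` and `run_workflow`, the tag-based matching, and the convention of returning a dictionary with a `return` code are all assumptions made for illustration only. The sketch shows artifacts wrapped as components with a meta description and named actions, stored in a small in-memory database, and a workflow that plugs in the first compatible model and dataset for a task.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Component:
    """Hypothetical reusable component: a name, a meta description, named actions."""
    name: str
    meta: Dict[str, Any]
    actions: Dict[str, Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]] = field(
        default_factory=dict
    )


class Repository:
    """Hypothetical in-memory database of components, searchable by tags."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def add(self, component: Component) -> None:
        self._components[component.name] = component

    def find(self, *tags: str) -> List[Component]:
        # A component matches when its meta lists every requested tag.
        wanted = set(tags)
        return [c for c in self._components.values()
                if wanted <= set(c.meta.get("tags", []))]

    def access(self, request: Dict[str, Any]) -> Dict[str, Any]:
        # Unified entry point (an assumption of this sketch): dict in, dict out,
        # with "return" set to 0 on success and non-zero on failure.
        component = self._components.get(request.get("component"))
        if component is None:
            return {"return": 1, "error": "component not found"}
        action = component.actions.get(request.get("action"))
        if action is None:
            return {"return": 1, "error": "action not found"}
        return {"return": 0, **action(component.meta, request)}


def describe(meta: Dict[str, Any], request: Dict[str, Any]) -> Dict[str, Any]:
    """A common action shared by different kinds of components."""
    return {"summary": f"{meta['kind']} with tags {sorted(meta['tags'])}"}


def run_workflow(repo: Repository, task: str) -> Dict[str, Any]:
    """Plug in the first compatible model and dataset registered for a task."""
    models = repo.find("model", task)
    datasets = repo.find("dataset", task)
    if not models or not datasets:
        return {"return": 1, "error": f"no compatible components for {task}"}
    return {"return": 0, "model": models[0].name, "dataset": datasets[0].name}


repo = Repository()
repo.add(Component("model-a",
                   {"kind": "model", "tags": ["model", "image-classification"]},
                   {"describe": describe}))
repo.add(Component("dataset-a",
                   {"kind": "dataset", "tags": ["dataset", "image-classification"]},
                   {"describe": describe}))

print(repo.access({"component": "model-a", "action": "describe"}))
print(run_workflow(repo, "image-classification"))
```

Because every call goes through a single dictionary-based entry point in this sketch, the same workflow code works regardless of who contributed a component, provided its meta description carries the expected tags.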

research
06/12/2020

The Collective Knowledge project: making ML models more portable and reproducible with open APIs, reusable best practices and MLOps

This article provides an overview of the Collective Knowledge technology...
research
03/31/2019

SysML'19 demo: customizable and reusable Collective Knowledge pipelines to automate and reproduce machine learning experiments

Reproducing, comparing and reusing results from machine learning and sys...
research
03/13/2022

BioSimulators: a central registry of simulation engines and services for recommending specific tools

Computational models have great potential to accelerate bioscience, bioe...
research
01/22/2020

CodeReef: an open platform for portable MLOps, reusable automation actions and reproducible benchmarking

We present CodeReef - an open platform to share all the components neces...
research
01/19/2018

Introducing ReQuEST: an Open Platform for Reproducible and Quality-Efficient Systems-ML Tournaments

Co-designing efficient machine learning based systems across the whole h...
research
01/19/2018

A Collective Knowledge workflow for collaborative research into multi-objective autotuning and machine learning techniques

Developing efficient software and hardware has never been harder whether...
