MLOS: An Infrastructure for Automated Software Performance Engineering

06/01/2020
by Carlo Curino, et al.

Developing modern systems software is a complex task that combines business logic programming and Software Performance Engineering (SPE). The latter is an experimental and labor-intensive activity focused on optimizing the system for a given hardware, software, and workload (hw/sw/wl) context. Today's SPE is performed during build/release phases by specialized teams, and is plagued by: 1) a lack of standardized and automated tools, 2) significant repeated work whenever the hw/sw/wl context changes, and 3) fragility induced by "one-size-fits-all" tuning, where improvements on one workload or component may degrade others. The net result: despite costly investments, system software often runs outside its optimal operating point, anecdotally leaving 30% of performance on the table. Recent developments in Data Science (DS) hint at an opportunity: combining DS tooling and methodologies with a new developer experience to transform the practice of SPE. In this paper we present MLOS, an ML-powered infrastructure and methodology to democratize and automate Software Performance Engineering. MLOS enables continuous, instance-level, robust, and trackable systems optimization. MLOS is being developed and employed within Microsoft to optimize SQL Server performance. Early results indicate that component-level optimizations can lead to 20% improvements for specific hw/sw/wl contexts, hinting at a significant opportunity. However, several research challenges remain that will require community involvement. To this end, we are in the process of open-sourcing the MLOS core infrastructure, and we are engaging with academic institutions to create an educational program around Software 2.0 and MLOS ideas.
