Towards Evaluating Exploratory Model Building Process with AutoML Systems

by   Sungsoo Ray Hong, et al.

The use of Automated Machine Learning (AutoML) systems is highly open-ended and exploratory. While rigorously evaluating how end-users interact with AutoML is crucial, establishing a robust evaluation methodology for such exploratory systems is challenging. First, AutoML systems are complex, comprising multiple sub-components that support a variety of sub-tasks for synthesizing ML pipelines, such as data preparation, problem specification, and model generation, which makes it difficult to derive insights about which components succeeded and which did not. Second, because the usage pattern of AutoML is highly exploratory, widely used task efficiency and effectiveness metrics cannot serve as the sole measures of success. To tackle these challenges, we propose an evaluation methodology that (1) guides AutoML builders to divide their AutoML system into multiple sub-system components, and (2) helps them reason about each component through visualization of end-users' behavioral patterns and attitudinal data. We conducted a study to understand when, how, and why applying our methodology can help builders better understand their systems and end-users. We recruited 3 teams of professional AutoML builders. The teams prepared their own systems, and 41 end-users used them. Using our methodology, we visualized end-users' behavioral and attitudinal data and distributed the results to the teams. We analyzed the results in two directions: (1) what types of novel insights the AutoML builders learned from end-users, and (2) how the evaluation methodology helped the builders understand the workflows and effectiveness of their systems. Our findings suggest new insights into future design opportunities in the AutoML domain, and show how our methodology helped the builders identify insights and draw concrete directions for improving their systems.

