What Programs Want: Automatic Inference of Input Data Specifications

07/21/2020
by Caterina Urban, et al.

As machine-learned software quickly permeates our society, we are becoming increasingly vulnerable to programming errors in the data pre-processing or training software, as well as to errors in the data itself. In this paper, we propose a static shape analysis framework for the input data of data-processing programs. Our analysis automatically infers necessary conditions on the structure and values of the data read by a data-processing program. Our framework builds on a family of underlying abstract domains, extended to indirectly reason about the input data rather than simply about the program variables. The choice of these abstract domains is a parameter of the analysis. We describe various instances built from existing abstract domains. The proposed approach is implemented in an open-source static analyzer for Python programs. We demonstrate its potential on a number of representative examples.
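To make "necessary conditions on the structure and values of the data" concrete, here is a minimal illustrative sketch. It is not the paper's analysis or its analyzer's API: `mean_of_readings` is a hypothetical toy data-processing program, and `satisfies_spec` is a hand-written checker for the kind of input specification that such an analysis might plausibly report for it. The conditions were derived by hand here, not inferred automatically.

```python
def mean_of_readings(lines):
    """Hypothetical toy data-processing program (not taken from the paper).

    Reads a count n from the first line, then n non-negative readings,
    and returns their mean.
    """
    it = iter(lines)
    n = int(next(it))            # line 1 must parse as an integer
    total = 0.0
    for _ in range(n):
        x = float(next(it))      # each of the next n lines must parse as a float
        if x < 0:
            raise ValueError("readings must be non-negative")
        total += x
    return total / n             # fails when n == 0


def satisfies_spec(lines):
    """Hand-derived conditions the input must meet for mean_of_readings
    to finish without raising (an assumption of this sketch, written manually)."""
    if not lines:
        return False             # structure: at least one line
    try:
        n = int(lines[0])
    except ValueError:
        return False             # value: line 1 is an integer
    if n == 0:
        return False             # value: the count must be non-zero
    needed = max(n, 0)           # a negative count reads no further lines
    if len(lines) < 1 + needed:
        return False             # structure: n readings follow the count
    for raw in lines[1:1 + needed]:
        try:
            x = float(raw)
        except ValueError:
            return False         # value: each reading is a float
        if x < 0:
            return False         # value: each reading is non-negative
    return True


good = ["2", "1.5", "2.5"]
bad = ["2", "1.5"]               # one reading is missing

print(satisfies_spec(good), mean_of_readings(good))  # True 2.0
print(satisfies_spec(bad))                           # False
```

Checking the data against such a specification before running the program separates errors in the data (the check fails) from errors in the program itself (the check passes but the program still misbehaves).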
