Toward Formal Data Set Verification for Building Effective Machine Learning Models

08/25/2021
by Jorge López, et al.

Training a machine learning model properly requires properly collected data. One way to guarantee this is to verify that the collected data set holds certain properties, for example that it contains samples across the whole input space, or that it is balanced w.r.t. the different classes. We present a formal approach for verifying a set of arbitrarily stated properties over a data set. The approach transforms the data set into a first-order logic formula, which can then be verified w.r.t. the properties, which are stated in the same logic. A prototype tool, which uses the z3 solver, has been developed; it takes as input a set of properties stated in a formal language and formally verifies a given data set w.r.t. those properties. Preliminary experimental results show the feasibility and performance of the approach, as well as its flexibility for expressing properties of interest.
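To make the two example properties concrete, the sketch below checks class balance and input-space coverage on a toy data set. It is only an illustration under stated assumptions: it evaluates each property directly as a Python predicate and does not reproduce the first-order logic encoding or the z3-based prototype described in the abstract. The names `balanced`, `covers`, and `verify`, the ratio threshold, and the equal-width binning are all hypothetical choices made for this example.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# A sample is (feature vector, class label); a property is a predicate over the data set.
Sample = Tuple[Tuple[float, ...], str]
Property = Callable[[List[Sample]], bool]


def balanced(max_ratio: float) -> Property:
    """Hypothetical property: largest class is at most max_ratio times the smallest."""
    def check(data: List[Sample]) -> bool:
        counts = Counter(label for _, label in data)
        return bool(counts) and max(counts.values()) <= max_ratio * min(counts.values())
    return check


def covers(feature: int, low: float, high: float, bins: int) -> Property:
    """Hypothetical property: each of `bins` equal-width intervals of [low, high]
    contains at least one sample for the given feature index."""
    def check(data: List[Sample]) -> bool:
        width = (high - low) / bins
        hit = set()
        for x, _ in data:
            v = x[feature]
            if low <= v <= high:
                # Clamp so that v == high falls into the last interval.
                hit.add(min(int((v - low) / width), bins - 1))
        return len(hit) == bins
    return check


def verify(data: List[Sample], properties: Dict[str, Property]) -> Dict[str, bool]:
    """Evaluate every named property on the data set and report the outcome."""
    return {name: prop(data) for name, prop in properties.items()}


properties = {"balanced": balanced(1.5), "coverage": covers(0, 0.0, 1.0, 4)}

data = [((0.1,), "a"), ((0.4,), "b"), ((0.6,), "a"), ((0.9,), "b")]
report = verify(data, properties)

# A skewed data set: three samples of class "a" clustered near 0.1, one of class "b".
skewed = [((0.1,), "a")] * 3 + [((0.2,), "b")]
skewed_report = verify(skewed, properties)
```

Direct evaluation like this only answers whether one concrete data set satisfies a property. Stating both the data set and the properties in one logic, as the abstract proposes, is what lets a solver handle arbitrarily stated properties uniformly.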


