Multisource AI Scorecard Table for System Evaluation

by Erik Blasch, et al.

The paper describes a Multisource AI Scorecard Table (MAST) that gives developers and users of an artificial intelligence (AI)/machine learning (ML) system a standard checklist. The checklist is based on the principles of good analysis adopted by the intelligence community (IC) and is intended to promote the development of more understandable systems and to engender trust in AI outputs. Such a scorecard enables a transparent, consistent, and meaningful understanding of AI tools applied for commercial and government use. A standard is built on compliance and agreement through policy, which requires buy-in from the stakeholders. While consistency for testing might only exist across a standard data set, the community still needs to discuss verification and validation approaches that can lead to interpretability, explainability, and proper use. The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system supporting various operational needs. The standards considered include sourcing, uncertainty, consistency, accuracy, and visualization. Three notional security-related use cases are presented for comparative analysis.
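As a rough illustration of what a scorecard-style checklist could look like in code, the minimal Python sketch below records one rating per criterion for a given system. Only the five criterion names come from the abstract. The `Scorecard` class, its methods, and the 0 to 3 rating scale are hypothetical choices made for this example and are not taken from the paper or from ICD 203.

```python
from dataclasses import dataclass, field

# Criteria named in the abstract; the full ICD 203 list is not reproduced here.
CRITERIA = ("sourcing", "uncertainty", "consistency", "accuracy", "visualization")

# Hypothetical rating scale (an assumption for this sketch, not from the paper).
MIN_RATING, MAX_RATING = 0, 3


@dataclass
class Scorecard:
    """Hypothetical per-system scorecard: one integer rating per criterion."""
    system_name: str
    ratings: dict = field(default_factory=dict)

    def rate(self, criterion: str, rating: int) -> None:
        # Reject criteria outside the checklist and ratings outside the scale.
        if criterion not in CRITERIA:
            raise ValueError(f"unknown criterion: {criterion}")
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(f"rating must be in [{MIN_RATING}, {MAX_RATING}]")
        self.ratings[criterion] = rating

    def missing(self) -> list:
        # Criteria that have not been rated yet, in checklist order.
        return [c for c in CRITERIA if c not in self.ratings]

    def total(self) -> int:
        # Sum of the ratings recorded so far.
        return sum(self.ratings.values())


card = Scorecard("example-system")
card.rate("sourcing", 2)
card.rate("accuracy", 3)
print(card.total(), card.missing())
```

Keeping the criteria in a fixed tuple means two systems scored with the same checklist can be compared row by row, which is the kind of consistency the abstract argues for.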



