Principles for Evaluation of AI/ML Model Performance and Robustness

07/06/2021
by Olivia Brown, et al.

The Department of Defense (DoD) has significantly increased its investment in the design, evaluation, and deployment of Artificial Intelligence and Machine Learning (AI/ML) capabilities to address national security needs. While there are numerous AI/ML successes in the academic and commercial sectors, many of these systems have also been shown to be brittle and nonrobust. In a complex and ever-changing national security environment, it is vital that the DoD establish a sound and methodical process to evaluate the performance and robustness of AI/ML models before these new capabilities are deployed to the field. This paper reviews the AI/ML development process, highlights common best practices for AI/ML model evaluation, and makes recommendations to DoD evaluators to ensure the deployment of robust AI/ML capabilities for national security needs.
