Special Session: Reliability Analysis for ML/AI Hardware

by Shamik Kundu, et al.

Artificial intelligence (AI) and machine learning (ML) are becoming pervasive in today's applications, such as autonomous vehicles, healthcare, aerospace, cybersecurity, and other critical domains. Ensuring the reliability and robustness of the underlying AI/ML hardware is therefore of paramount importance. In this paper, we explore and evaluate the reliability of different AI/ML hardware. The first section outlines the reliability issues in a commercial systolic array-based ML accelerator in the presence of faults arising from device-level non-idealities in the DRAM. Next, we quantify the impact of circuit-level faults in the MSB and LSB logic cones of the Multiply and Accumulate (MAC) block of the AI accelerator on AI/ML accuracy. Finally, we present two key reliability issues in emerging neuromorphic hardware platforms, circuit aging and endurance, and describe our system-level approach to mitigating them.
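
To make the MSB-versus-LSB distinction concrete, the following is a minimal sketch of a single-bit fault injected into the result of a MAC operation. It is not the authors' fault model or methodology: the 16-bit two's-complement accumulator width, the helper names `flip_bit` and `mac`, and the example operands are all assumptions chosen for illustration.

```python
def flip_bit(value: int, bit: int, width: int = 16) -> int:
    """Flip one bit of a two's-complement integer of the given width.

    Hypothetical helper: models a single bit-flip fault on a stored value.
    """
    mask = (1 << width) - 1
    raw = (value & mask) ^ (1 << bit)
    # Reinterpret the raw bit pattern as a signed integer.
    if raw & (1 << (width - 1)):
        return raw - (1 << width)
    return raw


def mac(weights, activations, fault_bit=None, width=16):
    """Multiply-and-accumulate with an optional bit flip on the final sum.

    Assumption: the fault hits the accumulator output, not the operands.
    """
    acc = 0
    for w, a in zip(weights, activations):
        acc += w * a
    if fault_bit is not None:
        acc = flip_bit(acc, fault_bit, width)
    return acc


# Illustrative operands (not taken from the paper).
weights = [3, -2, 5, 1]
activations = [4, 7, 2, 9]

golden = mac(weights, activations)
lsb_error = abs(mac(weights, activations, fault_bit=0) - golden)
msb_error = abs(mac(weights, activations, fault_bit=15) - golden)

print(f"fault-free result: {golden}")
print(f"error from LSB flip: {lsb_error}")
print(f"error from MSB flip: {msb_error}")
```

In this toy setting a flip in the least significant bit perturbs the sum by 1, while a flip in the most significant bit changes it by 32768 and inverts its sign, which is why faults in the two logic cones can be expected to affect inference accuracy very differently.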



