AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges

04/10/2023
by   Qian Cheng, et al.

Artificial Intelligence for IT Operations (AIOps) aims to combine the power of AI with the big data generated by IT operations processes, particularly in cloud infrastructures, to provide actionable insights with the primary goal of maximizing availability. There is a wide variety of problems to address, and multiple use cases where AI capabilities can be leveraged to enhance operational efficiency. Here we provide a review of the AIOps vision, trends, challenges, and opportunities, focusing specifically on the underlying AI techniques. We discuss in depth the key types of data emitted by IT operations activities, the scale and challenges in analyzing them, and where they can be helpful. We categorize the key AIOps tasks as incident detection, failure prediction, root cause analysis, and automated actions. We discuss the problem formulation for each task, and then present a taxonomy of techniques to solve these problems. We also identify relatively underexplored topics, especially those that could significantly benefit from advances in the AI literature. Finally, we provide insights into the trends in this field and the key investment opportunities.

