CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework

06/21/2022
by Xiaofeng Li, et al.

There is a growing demand for shifting the delivery of AI capability from data centers on the cloud to edge or end devices, exemplified by the fast-emerging real-time AI-based apps running on smartphones, AR/VR devices, autonomous vehicles, and various IoT devices. The shift has, however, been seriously hampered by the large and growing gap between DNN computing demands and the computing power available on edge or end devices. This article presents the design of XGen, an optimizing framework for DNNs designed to bridge that gap. XGen takes cross-cutting co-design as its first-order consideration. Its full-stack AI-oriented optimizations consist of a number of innovative optimizations at every layer of the DNN software stack, all designed in a cooperative manner. This unique technology enables XGen to optimize various DNNs, including those with an extreme depth (e.g., BERT, GPT, other transformers), and to generate code that runs several times faster than that from existing DNN frameworks, while delivering the same level of accuracy.

