During the past decade, Deep Learning (DL) algorithms, programming syste...
FP8 is a natural progression for accelerating deep learning training inf...
Rapid advances in artificial intelligence (AI) technology have led to si...
High-frequency ground motion simulations pose a grand challenge in compu...
Convolutional neural networks (CNNs) have found many applications in tas...
Full-batch training on Graph Neural Networks (GNN) to learn the structur...
During the past decade, novel Deep Learning (DL) algorithms/workloads an...
Deep Neural Networks (DNNs) have revolutionized many aspects of our live...
During the last two years, the goal of many researchers has been to sque...
At the heart of deep learning training and inferencing are computational...
We implement a Tensor Train layer in the TensorFlow Neural Machine Trans...
Deep learning (DL) is one of the most prominent branches of machine lear...
This paper presents the first comprehensive empirical study demonstratin...
In recent years fused-multiply-add (FMA) units with lower-precision mult...
Domain specific accelerators present new challenges and opportunities fo...
Convolution layers are prevalent in many classes of deep neural networks...
The state-of-the-art (SOTA) for mixed precision training is dominated by...