
Towards Efficient Point Cloud Graph Neural Networks Through Architectural Simplification
In recent years graph neural network (GNN)-based approaches have become ...

S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration
Exploiting sparsity is a key technique in accelerating quantized convolu...

On the Effects of Quantisation on Model Uncertainty in Bayesian Neural Networks
Bayesian neural networks (BNNs) are making significant progress in many ...

Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices
Structured matrices, such as those derived from Kronecker products (KP),...

Information contraction in noisy binary neural networks and its implications
Neural networks have gained importance as the machine learning models th...

MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers
Executing machine learning workloads locally on resource constrained mic...

Rank and runtime aware compression of NLP Applications
Sequence model based NLP applications can be large. Yet, many applicatio...

Sparse Systolic Tensor Array for Efficient CNN Hardware Acceleration
Convolutional neural network (CNN) inference on mobile devices demands e...

High Throughput Matrix-Matrix Multiplication between Asymmetric Bit-Width Operands
Matrix multiplications between asymmetric bit-width operands, especially...

Efficient Residue Number System Based Winograd Convolution
Prior research has shown that Winograd algorithm can reduce the computat...

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids
Modern speech enhancement algorithms achieve remarkable noise suppressio...

Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference
Convolutional neural network (CNN) inference on mobile devices demands e...

Searching for Winograd-aware Quantized Networks
Lightweight architectural designs of Convolutional Neural Networks (CNNs...

Compressing Language Models using Doped Kronecker Products
Kronecker Products (KP) have been used to compress IoT RNN Applications ...

Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation
The success of deep learning has brought forth a wave of interest in com...

ISP4ML: Understanding the Role of Image Signal Processing in Efficient Deep Learning Vision Systems
Convolutional neural networks (CNNs) are now predominant components in a...

Ternary MobileNets via Per-Layer Hybrid Filter Banks
MobileNets family of computer vision neural networks have fueled tremend...

Pushing the limits of RNN Compression
Recurrent Neural Networks (RNN) can be difficult to deploy on resource c...

Run-Time Efficient RNN Compression for Inference on Edge Devices
Recurrent neural networks can be large and compute-intensive, yet many a...

Compressing RNNs for IoT devices by 15-38x using Kronecker Products
Recurrent Neural Networks (RNN) can be large and compute-intensive, maki...

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers
The vast majority of processors in the world are actually microcontrolle...

Measuring scheduling efficiency of RNNs for NLP applications
Recurrent neural networks (RNNs) have shown state of the art results for...

Ternary Hybrid Neural-Tree Networks for Highly Constrained IoT Applications
Machine learning-based applications are increasingly prevalent in IoT de...

Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs
The Winograd or Cook-Toom class of algorithms help to reduce the overall...

Learning low-precision neural networks without Straight-Through Estimator (STE)
The Straight-Through Estimator (STE) is widely used for backpropagating...

FixyNN: Efficient Hardware for Mobile Computer Vision via Transfer Learning
The computational demands of computer vision tasks based on state-of-the...

Efficient and Robust Machine Learning for Real-World Systems
While machine learning is traditionally a resource intensive task, embed...

Energy Efficient Hardware for On-Device CNN Inference via Transfer Learning
On-device CNN inference for real-time computer vision applications can r...

SCALE-Sim: Systolic CNN Accelerator Simulator
Systolic Arrays are one of the most popular compute substrates within De...

SCALE-Sim: Systolic CNN Accelerator
Systolic Arrays are one of the most popular compute substrates within De...

Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision
Continuous computer vision (CV) tasks increasingly rely on convolutional...

Mobile Machine Learning Hardware at ARM: A Systems-on-Chip (SoC) Perspective
Machine learning is playing an increasingly significant role in emerging...
Matthew Mattina