Many efficient approximate self-attention techniques have become prevale...
Transformers are central to recent successes in natural language process...
The explosive growth of language models and their applications has led ...
Careful placement of a computational application within a target device ...
Pretraining on a large-scale corpus has become a standard method to buil...
We propose Conditional Adapter (CoDA), a parameter-efficient transfer le...
In this work, we propose a novel and scalable solution to address the ch...
On-device ML accelerators are becoming a standard in modern mobile syste...
Sparsely-activated Mixture-of-experts (MoE) models allow the number of p...
Scaling language models with more data, compute and parameters has drive...
Multi-Chip-Modules (MCMs) reduce the design and fabrication cost of mach...
The research community has proposed copious modifications to the Transfo...
Neural architectures and hardware accelerators have been two driving for...
The looming end of Moore's Law and ascending use of deep learning drives...
Most compilers for machine learning (ML) frameworks need to solve many c...
Accurate hardware performance models are critical to efficient code gene...
Omnidirectional 360 cameras proliferate rapidly for autonomous robots si...
Transfer learning, where a model is first pre-trained on a data-rich tas...
Runtime and scalability of large neural networks can be significantly af...
In this paper, we propose Efficient Progressive Neural Architecture Sear...
Neural Architecture Search (NAS) is a laborious process. Prior work on a...
Voice cloning is a highly desired feature for personalized speech interf...
Deep learning (DL) creates impactful advances following a virtuous recip...
We introduce a technique for augmenting neural text-to-speech (TTS) with...