Careful placement of a computational application within a target device ...
Pretraining on a large-scale corpus has become a standard method to buil...
Sparsely-activated Mixture-of-experts (MoE) models allow the number of p...
Multi-Chip-Modules (MCMs) reduce the design and fabrication cost of mach...
Edge TPUs are a domain of accelerators for low-power, edge devices and a...
Neural architectures and hardware accelerators have been two driving for...
The looming end of Moore's Law and ascending use of deep learning drives...
Most compilers for machine learning (ML) frameworks need to solve many c...
In this work, we present a learning-based approach to chip placement, on...
Runtime and scalability of large neural networks can be significantly af...
Many architects believe that major improvements in cost-energy-performan...