Many efficient approximate self-attention techniques have become prevale...
Transformers are central to recent successes in natural language process...
The explosive growth of language models and their applications has led ...
Careful placement of a computational application within a target device ...
Pretraining on a large-scale corpus has become a standard method to buil...
We propose Conditional Adapter (CoDA), a parameter-efficient transfer le...
In this work, we propose a novel and scalable solution to address the ch...
On-device ML accelerators are becoming a standard in modern mobile syste...
Sparsely-activated Mixture-of-experts (MoE) models allow the number of p...
Scaling language models with more data, compute and parameters has drive...
Multi-Chip-Modules (MCMs) reduce the design and fabrication cost of mach...
The research community has proposed copious modifications to the Transfo...
Neural architectures and hardware accelerators have been two driving for...
The looming end of Moore's Law and ascending use of deep learning drives...
Most compilers for machine learning (ML) frameworks need to solve many c...
Accurate hardware performance models are critical to efficient code gene...
Omnidirectional 360 cameras proliferate rapidly for autonomous robots si...
Transfer learning, where a model is first pre-trained on a data-rich tas...
Runtime and scalability of large neural networks can be significantly af...
In this paper, we propose Efficient Progressive Neural Architecture Sear...
Neural Architecture Search (NAS) is a laborious process. Prior work on a...
Voice cloning is a highly desired feature for personalized speech interf...
Deep learning (DL) creates impactful advances following a virtuous recip...
We introduce a technique for augmenting neural text-to-speech (TTS) with...