Song Han

research

∙ 09/21/2023

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

We present LongLoRA, an efficient fine-tuning approach that extends the ...

0 Yukang Chen, et al. ∙

research

∙ 06/15/2023

Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network

EIE proposed to accelerate pruned and compressed neural networks, exploi...

0 Song Han, et al. ∙

research

∙ 06/01/2023

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Large language models (LLMs) have shown excellent performance on various...

0 Ji Lin, et al. ∙

research

∙ 05/31/2023

DOTA: A Dynamically-Operated Photonic Tensor Core for Energy-Efficient Transformer Accelerator

The wide adoption and significant computing resource consumption of atte...

0 Hanqing Zhu, et al. ∙

research

∙ 05/26/2023

Real-Time Scheduling for Time-Sensitive Networking: A Systematic Review and Experimental Study

Time-Sensitive Networking (TSN) has been recognized as one of the key en...

0 Chuanyu Xue, et al. ∙

research

∙ 05/17/2023

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

Diffusion models excel at text-to-image generation, especially in subjec...

0 Guangxuan Xiao, et al. ∙

research

∙ 03/30/2023

SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer

High-resolution images enable neural networks to learn richer visual rep...

1 Xuanyao Chen, et al. ∙

research

∙ 02/09/2023

Offsite-Tuning: Transfer Learning without Full Model

Transfer learning is important for foundation models to adapt to downstr...

0 Guangxuan Xiao, et al. ∙

research

∙ 01/20/2023

FlatFormer: Flattened Window Attention for Efficient Point Cloud Transformer

Transformer, as an alternative to CNN, has been proven effective in many...

0 Zhijian Liu, et al. ∙

research

∙ 11/18/2022

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Large language models (LLMs) show excellent performance but are compute-...

0 Guangxuan Xiao, et al. ∙

research

∙ 11/03/2022

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

During image editing, existing deep generative models tend to re-synthes...

6 Muyang Li, et al. ∙

research

∙ 10/30/2022

QuEst: Graph Transformer for Quantum Circuit Reliability Estimation

Among different quantum algorithms, PQC for QML show promises on near-te...

0 Hanrui Wang, et al. ∙

research

∙ 10/15/2022

TopGen: Topology-Aware Bottom-Up Generator for Variational Quantum Circuits

Variational Quantum Algorithms (VQA) are promising to demonstrate quantu...

0 Jinglei Cheng, et al. ∙

research

∙ 08/02/2022

PAN: Pulse Ansatz on NISQ Machines

Variational quantum algorithms (VQAs) have demonstrated great potentials...

0 Zhiding Liang, et al. ∙

research

∙ 07/13/2022

RobustAnalog: Fast Variation-Aware Analog Circuit Design Via Multi-task RL

Analog/mixed-signal circuit design is one of the most complex and time-c...

1 Wei Shi, et al. ∙

research

∙ 06/30/2022

On-Device Training Under 256KB Memory

On-device training enables the model to adapt to new data collected from...

14 Ji Lin, et al. ∙

research

∙ 06/19/2022

MME-CRS: Multi-Metric Evaluation Based on Correlation Re-Scaling for Evaluating Open-Domain Dialogue

Automatic open-domain dialogue evaluation is a crucial component of dial...

0 Pengfei Zhang, et al. ∙

research

∙ 05/29/2022

EfficientViT: Enhanced Linear Attention for High-Resolution Low-Computation Visual Recognition

Vision Transformer (ViT) has achieved remarkable performance in many vis...

0 Han Cai, et al. ∙

research

∙ 05/26/2022

BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation

Multi-sensor fusion is essential for an accurate and reliable autonomous...

21 Zhijian Liu, et al. ∙

research

∙ 05/03/2022

Lite Pose: Efficient Architecture Design for 2D Human Pose Estimation

Pose estimation plays a critical role in human-centered vision applicati...

0 Yihan Wang, et al. ∙

research

∙ 04/25/2022

PVNAS: 3D Neural Architecture Search with Point-Voxel Convolution

3D neural networks are widely used in real-world applications (e.g., AR/...

0 Zhijian Liu, et al. ∙

research

∙ 04/25/2022

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Deep neural networks (DNNs) have achieved unprecedented success in the f...

0 Han Cai, et al. ∙

research

∙ 04/21/2022

TorchSparse: Efficient Point Cloud Inference Engine

Deep learning on point clouds has received increased attention thanks to...

14 Haotian Tang, et al. ∙

research

∙ 02/26/2022

QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning

Parameterized Quantum Circuits (PQC) are drawing increasing research int...

0 Hanrui Wang, et al. ∙

research

∙ 12/27/2021

AET-SGD: Asynchronous Event-triggered Stochastic Gradient Descent

Communication cost is the main bottleneck for the design of effective di...

0 Nhuong Nguyen, et al. ∙

research

∙ 11/23/2021

VISTA 2.0: An Open, Data-driven Simulator for Multimodal Sensing and Policy Learning for Autonomous Vehicles

Simulation has the potential to transform the development of robust algo...

7 Alexander Amini, et al. ∙

research

∙ 10/28/2021

MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning

Tiny deep learning on microcontroller units (MCUs) is challenging due to...

19 Ji Lin, et al. ∙

research

∙ 10/21/2021

RoQNN: Noise-Aware Training for Robust Quantum Neural Networks

Quantum Neural Network (QNN) is a promising application towards quantum ...

0 Hanrui Wang, et al. ∙

research

∙ 10/17/2021

Network Augmentation for Tiny Deep Learning

We introduce Network Augmentation (NetAug), a new training method for im...

0 Han Cai, et al. ∙

research

∙ 10/14/2021

PointAcc: Efficient Point Cloud Accelerator

Deep learning on point clouds plays a vital role in a wide range of appl...

0 Yujun Lin, et al. ∙

research

∙ 09/27/2021

TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device

The explosive growth in video streaming requires video understanding at ...

0 Ji Lin, et al. ∙

research

∙ 08/26/2021

LocTex: Learning Data-Efficient Visual Representations from Localized Textual Supervision

Computer vision tasks such as object detection and semantic/instance seg...

15 Zhijian Liu, et al. ∙

research

∙ 07/22/2021

QuantumNAS: Noise-Adaptive Search for Robust Quantum Circuits

Quantum noise is the key challenge in Noisy Intermediate-Scale Quantum (...

40 Hanrui Wang, et al. ∙

research

∙ 05/27/2021

NAAS: Neural Accelerator Architecture Search

Data-driven, automatic design space exploration of neural accelerator ar...

0 Yujun Lin, et al. ∙

research

∙ 05/20/2021

Efficient and Robust LiDAR-Based End-to-End Navigation

Deep learning has been used to demonstrate end-to-end neural network lea...

7 Zhijian Liu, et al. ∙

research

∙ 03/10/2021

PatchNet – Short-range Template Matching for Efficient Video Processing

Object recognition is a fundamental problem in many video processing tas...

31 Huizi Mao, et al. ∙

research

∙ 03/04/2021

Anycost GANs for Interactive Image Synthesis and Editing

Generative adversarial networks (GANs) have enabled photorealistic image...

23 Ji Lin, et al. ∙

research

∙ 12/17/2020

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

The attention mechanism is becoming increasingly popular in Natural Lang...

0 Hanrui Wang, et al. ∙

research

∙ 11/02/2020

IOS: Inter-Operator Scheduler for CNN Acceleration

To accelerate CNN inference, existing deep learning frameworks focus on ...

0 Yaoyao Ding, et al. ∙

research

∙ 08/11/2020

Hardware-Centric AutoML for Mixed-Precision Quantization

Model quantization is a widely used technique to compress and accelerate...

7 Kuan Wang, et al. ∙

research

∙ 07/31/2020

Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution

Self-driving cars need to understand 3D scenes efficiently and accuratel...

4 Haotian Tang, et al. ∙

research

∙ 07/22/2020

Tiny Transfer Learning: Towards Memory-Efficient On-Device Learning

We present Tiny-Transfer-Learning (TinyTL), an efficient on-device learn...

21 Han Cai, et al. ∙

research

∙ 07/20/2020

MCUNet: Tiny Deep Learning on IoT Devices

Machine learning on tiny IoT devices based on microcontroller units (MCU...

0 Ji Lin, et al. ∙

research

∙ 06/18/2020

Differentiable Augmentation for Data-Efficient GAN Training

The performance of generative adversarial networks (GANs) heavily deteri...

11 Shengyu Zhao, et al. ∙

research

∙ 06/15/2020

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

We present APQ for efficient deep learning inference on resource-constra...

0 Tianzhe Wang, et al. ∙

research

∙ 05/28/2020

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

Transformers are ubiquitous in Natural Language Processing (NLP) tasks, ...

24 Hanrui Wang, et al. ∙

research

∙ 05/16/2020

MicroNet for Efficient Language Modeling

It is important to design compact language models for efficient deployme...

0 Zhongxia Yan, et al. ∙

research

∙ 04/30/2020

GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning

Automatic transistor sizing is a challenging problem in circuit design d...

0 Hanrui Wang, et al. ∙

research

∙ 04/24/2020

Lite Transformer with Long-Short Range Attention

Transformer has become ubiquitous in natural language processing (e.g., ...

0 Zhanghao Wu, et al. ∙

research

∙ 04/12/2020

A Fast Algorithm for Source-Wise Round-Trip Spanners

In this paper, we study the problem of efficiently constructing source-w...

0 Chun Jiang Zhu, et al. ∙
