Mu Li

research

∙ 07/19/2023

PreDiff: Precipitation Nowcasting with Latent Diffusion Models

Earth system forecasting has traditionally relied on complex physical mo...

1 Zhihan Gao, et al. ∙

research

∙ 05/16/2023

Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation

It has been commonly observed that a teacher model with superior perform...

3 Yuxin Ren, et al. ∙

research

∙ 05/10/2023

XTab: Cross-table Pretraining for Tabular Transformers

The success of self-supervised learning in computer vision and natural l...

0 Bingzhao Zhu, et al. ∙

research

∙ 05/04/2023

Scanpath Prediction in Panoramic Videos via Expected Code Length Minimization

Predicting human scanpaths when exploring panoramic videos is a challeng...

0 Mu Li, et al. ∙

research

∙ 04/10/2023

A Cheaper and Better Diffusion Language Model with Soft-Masked Noise

Diffusion models that are based on iterative denoising have been recentl...

5 Jiaao Chen, et al. ∙

research

∙ 04/10/2023

Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition

This work proposes POMP, a prompt pre-training method for vision-languag...

5 Shuhuai Ren, et al. ∙

research

∙ 03/08/2023

RAF: Holistic Compilation for Deep Learning Model Training

As deep learning is pervasive in modern applications, many deep learning...

0 Cody Hao Yu, et al. ∙

research

∙ 02/16/2023

LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation

Layout-to-image generation refers to the task of synthesizing photo-real...

0 Jiaxin Cheng, et al. ∙

research

∙ 02/02/2023

Multimodal Chain-of-Thought Reasoning in Language Models

Large language models (LLMs) have shown impressive performance on comple...

0 Zhuosheng Zhang, et al. ∙

research

∙ 01/04/2023

Parameter-Efficient Fine-Tuning Design Spaces

Parameter-efficient fine-tuning aims to achieve performance comparable t...

1 Jiaao Chen, et al. ∙

research

∙ 12/29/2022

Learning Multimodal Data Augmentation in Feature Space

The ability to jointly learn from multiple modalities, such as text, aud...

14 Zichang Liu, et al. ∙

research

∙ 12/21/2022

What Makes for Good Tokenizers in Vision Transformer?

The architecture of transformers, which recently witness booming applica...

10 Shengju Qian, et al. ∙

research

∙ 12/21/2022

SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning

Pre-trained large language models can efficiently interpolate human-writ...

8 M Saiful Bari, et al. ∙

research

∙ 12/15/2022

Are Multimodal Models Robust to Image and Text Perturbations?

Multimodal image-text models have shown remarkable performance in the pa...

11 Jielin Qiu, et al. ∙

research

∙ 10/10/2022

Visual Prompt Tuning for Test-time Domain Adaptation

Models should have the ability to adapt to unseen data during test-time ...

9 Yunhe Gao, et al. ∙

research

∙ 10/07/2022

Automatic Chain of Thought Prompting in Large Language Models

Large language models (LLMs) can perform complex reasoning by generating...

0 Zhuosheng Zhang, et al. ∙

research

∙ 08/17/2022

An Efficient Coarse-to-Fine Facet-Aware Unsupervised Summarization Framework based on Semantic Blocks

Unsupervised summarization methods have achieved remarkable results by i...

0 Xinnian Liang, et al. ∙

research

∙ 07/12/2022

Earthformer: Exploring Space-Time Transformers for Earth System Forecasting

Conventionally, Earth system (e.g., weather and climate) forecasting rel...

32 Zhihan Gao, et al. ∙

research

∙ 07/04/2022

Partial and Asymmetric Contrastive Learning for Out-of-Distribution Detection in Long-Tailed Recognition

Existing out-of-distribution (OOD) detection methods are typically bench...

7 Haotao Wang, et al. ∙

research

∙ 07/04/2022

Removing Batch Normalization Boosts Adversarial Training

Adversarial training (AT) defends deep neural networks against adversari...

0 Haotao Wang, et al. ∙

research

∙ 06/16/2022

MixGen: A New Multi-Modal Data Augmentation

Data augmentation is a necessity to enhance data efficiency in deep lear...

36 Xiaoshuai Hao, et al. ∙

research

∙ 04/30/2022

MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud

Existing general purpose frameworks for gigantic model training, i.e., m...

0 Zhen Zhang, et al. ∙

research

∙ 04/09/2022

Modeling Multi-Granularity Hierarchical Features for Relation Extraction

Relation extraction is a key task in Natural Language Processing (NLP), ...

0 Xinnian Liang, et al. ∙

research

∙ 03/24/2022

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

Multiple datasets and open challenges for object detection have been int...

5 Likun Cai, et al. ∙

research

∙ 03/22/2022

Task-guided Disentangled Tuning for Pretrained Language Models

Pretrained language models (PLMs) trained on large-scale unlabeled corpu...

0 Jiali Zeng, et al. ∙

research

∙ 03/22/2022

Learning Confidence for Transformer-based Neural Machine Translation

Confidence estimation aims to quantify the confidence of the model predi...

0 Yu Lu, et al. ∙

research

∙ 12/25/2021

Pseudocylindrical Convolutions for Learned Omnidirectional Image Compression

Although equirectangular projection (ERP) is a convenient form to store ...

5 Mu Li, et al. ∙

research

∙ 11/04/2021

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

We consider the use of automated supervised learning systems for data ta...

0 Xingjian Shi, et al. ∙

research

∙ 10/28/2021

Blending Anti-Aliasing into Vision Transformer

The transformer architectures, based on self-attention mechanism and con...

19 Shengju Qian, et al. ∙

research

∙ 09/23/2021

Distiller: A Systematic Study of Model Distillation Methods in Natural Language Processing

We aim to identify how different components in the KD pipeline affect th...

0 Haoyu He, et al. ∙

research

∙ 09/15/2021

Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context

Embedding based methods are widely used for unsupervised keyphrase extra...

0 Xinnian Liang, et al. ∙

research

∙ 08/12/2021

Progressive Coordinate Transforms for Monocular 3D Object Detection

Recognizing and localizing objects in the 3D space is a crucial ability ...

3 Li Wang, et al. ∙

research

∙ 07/29/2021

A Unified Efficient Pyramid Transformer for Semantic Segmentation

Semantic segmentation is a challenging problem due to difficulties in mo...

8 Fangrui Zhu, et al. ∙

research

∙ 06/21/2021

Dive into Deep Learning

This open-source book represents our attempt to make deep learning appro...

0 Aston Zhang, et al. ∙

research

∙ 02/04/2021

SelfNorm and CrossNorm for Out-of-Distribution Robustness

Normalization techniques are crucial in stabilizing and accelerating the...

12 Zhiqiang Tang, et al. ∙

research

∙ 11/06/2020

Improving Machine Reading Comprehension with Single-choice Decision and Transfer Learning

Multi-choice Machine Reading Comprehension (MMRC) aims to select the cor...

0 Yufan Jiang, et al. ∙

research

∙ 08/26/2020

FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems

Graph neural networks (GNNs) are gaining increasing popularity as a prom...

0 Yuwei Hu, et al. ∙

research

∙ 07/26/2020

CSER: Communication-efficient SGD with Error Reset

The scalability of Distributed Stochastic Gradient Descent (SGD) is toda...

2 Cong Xie, et al. ∙

research

∙ 06/24/2020

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

BERT has recently attracted a lot of attention in natural language under...

0 Shuai Zheng, et al. ∙

research

∙ 06/04/2020

Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

Modern deep neural networks increasingly make use of features such as dy...

0 Haichen Shen, et al. ∙

research

∙ 05/10/2020

Learning Context-Based Non-local Entropy Modeling for Image Compression

The entropy of the codes usually serves as the rate loss in the recent l...

2 Mu Li, et al. ∙

research

∙ 04/30/2020

Improving Semantic Segmentation via Self-Training

Deep learning usually achieves the best results with complete supervisio...

8 Yi Zhu, et al. ∙

research

∙ 04/19/2020

ResNeSt: Split-Attention Networks

While image classification models have recently continued to advance, mo...

16 Hang Zhang, et al. ∙

research

∙ 03/13/2020

AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data

We introduce AutoGluon-Tabular, an open-source AutoML framework that req...

4 Nick Erickson, et al. ∙

research

∙ 07/09/2019

GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

We present GluonCV and GluonNLP, the deep learning toolkits for computer...

5 Jian Guo, et al. ∙

research

∙ 07/03/2019

A Unified Optimization Approach for CNN Model Inference on Integrated GPUs

Modern deep learning applications urge to push the model inference takin...

0 Leyuan Wang, et al. ∙

research

∙ 06/24/2019

Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression

It has long been understood that precisely estimating the probabilistic ...

0 Mu Li, et al. ∙

research

∙ 04/26/2019

Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

With an increasing demand for training powers for deep learning algorith...

12 Haibin Lin, et al. ∙

research

∙ 04/20/2019

Language Models with Transformers

The Transformer architecture is superior to RNN-based models in computat...

16 Chenguang Wang, et al. ∙

research

∙ 04/01/2019

Learning Content-Weighted Deep Image Compression

Learning-based lossy image compression usually involves the joint optimi...

10 Mu Li, et al. ∙

