Liang Ding

research

∙ 08/30/2023

MerA: Merging Pretrained Adapters For Few-Shot Learning

Adapter tuning, which updates only a few parameters, has become a mainst...

0 Shwai He, et al. ∙

research

∙ 08/29/2023

Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models

Most open-domain dialogue systems suffer from forgetting important infor...

0 Qingyue Wang, et al. ∙

research

∙ 08/24/2023

Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?

The multimedia community has shown a significant interest in perceiving ...

0 Fei Wang, et al. ∙

research

∙ 07/30/2023

Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup

Adaptive optimization has achieved notable success for distributed learn...

0 Yan Sun, et al. ∙

research

∙ 07/13/2023

Free-Form Composition Networks for Egocentric Action Recognition

Egocentric action recognition is gaining significant attention in the fi...

0 Haoran Wang, et al. ∙

research

∙ 06/05/2023

Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training

Dense retrievers have achieved impressive performance, but their demand ...

0 Yibin Lei, et al. ∙

research

∙ 06/01/2023

Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking

Zero-shot transfer learning for Dialogue State Tracking (DST) helps to h...

0 Qingyue Wang, et al. ∙

research

∙ 05/24/2023

Self-Evolution Learning for Discriminative Language Model Pretraining

Masked language modeling, widely used in discriminative language model (...

0 Qihuang Zhong, et al. ∙

research

∙ 05/24/2023

Revisiting Token Dropping Strategy in Efficient BERT Pretraining

Token dropping is a recently-proposed strategy to speed up the pretraini...

0 Qihuang Zhong, et al. ∙

research

∙ 05/22/2023

Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks

Text classification tasks often encounter few shot scenarios with limite...

0 Haoqi Zheng, et al. ∙

research

∙ 05/19/2023

Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape

In federated learning (FL), a cluster of local clients are chaired under...

0 Yan Sun, et al. ∙

research

∙ 05/05/2023

Random Smoothing Regularization in Kernel Gradient Descent Learning

Random smoothing data augmentation is a unique form of regularization th...

0 Liang Ding, et al. ∙

research

∙ 04/29/2023

Representing Additive Gaussian Processes by Sparse Matrices

Among generalized additive models, additive Matérn Gaussian Processes (G...

0 Lu Zou, et al. ∙

research

∙ 04/20/2023

Prompt-Learning for Cross-Lingual Relation Extraction

Relation Extraction (RE) is a crucial task in Information Extraction, wh...

0 Chiaming Hsu, et al. ∙

research

∙ 04/07/2023

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review

The field of deep learning has witnessed significant progress, particula...

0 Li Shen, et al. ∙

research

∙ 03/24/2023

Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models: A Case Study on ChatGPT

Generative large language models (LLMs), e.g., ChatGPT, have demonstrate...

0 Qingyu Lu, et al. ∙

research

∙ 03/24/2023

Towards Making the Most of ChatGPT for Machine Translation

ChatGPT shows remarkable capabilities for machine translation (MT). Seve...

0 Keqin Peng, et al. ∙

research

∙ 03/01/2023

AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks

Sharpness aware minimization (SAM) optimizer has been extensively explor...

0 Hao Sun, et al. ∙

research

∙ 02/21/2023

FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy

Federated learning is an emerging distributed machine learning framework...

0 Yan Sun, et al. ∙

research

∙ 02/19/2023

Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT

Recently, ChatGPT has attracted great attention, as it can generate flue...

25 Qihuang Zhong, et al. ∙

research

∙ 02/18/2023

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE

This technical report briefly describes our JDExplore d-team's submissio...

15 Qihuang Zhong, et al. ∙

research

∙ 01/25/2023

mcGP: mesh-clustered Gaussian process emulator for partial differential equation systems

Partial differential equations (PDEs) have become an essential tool for ...

0 Chih-Li Sung, et al. ∙

research

∙ 12/20/2022

Original or Translated? On the Use of Parallel Data for Translation Quality Estimation

Machine Translation Quality Estimation (QE) is the task of evaluating tr...

0 Baopu Qiu, et al. ∙

research

∙ 12/20/2022

Toward Human-Like Evaluation for Natural Language Generation with Error Analysis

The state-of-the-art language model-based automatic metrics, e.g. BARTSc...

0 Qingyu Lu, et al. ∙

research

∙ 12/04/2022

Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE

This technical report briefly describes our JDExplore d-team's Vega v2 s...

0 Qihuang Zhong, et al. ∙

research

∙ 12/02/2022

Improving Simultaneous Machine Translation with Monolingual Data

Simultaneous machine translation (SiMT) is usually done via sequence-lev...

0 Hexuan Deng, et al. ∙

research

∙ 11/10/2022

Cherry Hypothesis: Identifying the Cherry on the Cake for Dynamic Networks

Dynamic networks have been extensively explored as they can considerably...

0 Shwai He, et al. ∙

research

∙ 10/11/2022

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

Fine-tuning large pretrained language models on a limited training corpu...

7 Qihuang Zhong, et al. ∙

research

∙ 10/09/2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters

Adapter Tuning, which freezes the pretrained language models (PLMs) and ...

5 Shwai He, et al. ∙

research

∙ 09/20/2022

Vega-MT: The JD Explore Academy Translation System for WMT22

We describe the JD Explore Academy's submission of the WMT 2022 shared g...

2 Changtong Zan, et al. ∙

research

∙ 09/07/2022

On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation

Pre-Training (PT) of text representations has been successfully applied ...

2 Changtong Zan, et al. ∙

research

∙ 08/22/2022

PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation

Prompt-tuning, which freezes pretrained language models (PLMs) and only ...

17 Qihuang Zhong, et al. ∙

research

∙ 07/18/2022

Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks

Graph Neural Networks (GNNs) tend to suffer from high computation costs ...

4 Chuang Liu, et al. ∙

research

∙ 05/30/2022

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

Sequence-to-sequence (seq2seq) learning has become a popular trend for p...

1 Qihuang Zhong, et al. ∙

research

∙ 05/28/2022

Parameter-Efficient and Student-Friendly Knowledge Distillation

Knowledge distillation (KD) has been extensively employed to transfer th...

9 Jun Rao, et al. ∙

research

∙ 05/22/2022

Interpretable Proof Generation via Iterative Backward Reasoning

We present IBR, an Iterative Backward Reasoning model to solve the proof...

3 Hanhao Qu, et al. ∙

research

∙ 04/16/2022

BLISS: Robust Sequence-to-Sequence Learning via Self-Supervised Input Representation

Data augmentations (DA) are the cores to achieving robust sequence-to-se...

0 Zheng Zhang, et al. ∙

research

∙ 04/16/2022

Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation

For multilingual sequence-to-sequence pretrained language models (multil...

0 Changtong Zan, et al. ∙

research

∙ 04/16/2022

A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis

Aspect-Based Sentiment Analysis is a fine-grained sentiment analysis tas...

0 Bing Wang, et al. ∙

research

∙ 03/17/2022

Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning

Federated Learning (FL) is an emerging distributed learning paradigm und...

0 Lin Zhang, et al. ∙

research

∙ 03/08/2022

Where Does the Performance Improvement Come From? – A Reproducibility Concern about Image-Text Retrieval

This paper seeks to provide the information retrieval community with som...

0 Jun Rao, et al. ∙

research

∙ 03/07/2022

Kernel Packet: An Exact and Scalable Algorithm for Gaussian Process Regression with Matérn Correlations

We develop an exact and scalable algorithm for one-dimensional Gaussian ...

0 Haoyuan Chen, et al. ∙

research

∙ 01/19/2022

Improving Neural Machine Translation by Denoising Training

We present a simple and effective pretraining strategy Denoising Trainin...

0 Liang Ding, et al. ∙

research

∙ 01/13/2022

Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis

Aspect-based sentiment analysis (ABSA) is a fine-grained task of sentime...

0 Qihuang Zhong, et al. ∙

research

∙ 12/11/2021

A Sparse Expansion For Deep Gaussian Processes

Deep Gaussian Processes (DGP) enable a non-parametric approach to quanti...

0 Liang Ding, et al. ∙

research

∙ 10/26/2021

Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis

Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment p...

0 Juhua Liu, et al. ∙

research

∙ 10/05/2021

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation

Pre-training (PT) and back-translation (BT) are two simple and powerful ...

0 Xuebo Liu, et al. ∙

research

∙ 09/16/2021

Improving Neural Machine Translation by Bidirectional Training

We present a simple and effective pretraining strategy – bidirectional t...

0 Liang Ding, et al. ∙

research

∙ 07/24/2021

The USYD-JD Speech Translation System for IWSLT 2021

This paper describes the University of Sydney & JD's joint submission o...

0 Liang Ding, et al. ∙

research

∙ 07/19/2021

High-Dimensional Simulation Optimization via Brownian Fields and Sparse Grids

High-dimensional simulation optimization is notoriously challenging. We ...

0 Liang Ding, et al. ∙
