Shuming Shi

research

∙ 09/11/2023

TeGit: Generating High-Quality Instruction-Tuning Data with Text-Grounded Task Design

High-quality instruction-tuning data is critical to improving LLM capabi...

0 Yongrui Chen, et al. ∙

research

∙ 09/03/2023

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

While large language models (LLMs) have demonstrated remarkable capabili...

0 Yue Zhang, et al. ∙

research

∙ 08/12/2023

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher

Safety lies at the core of the development of Large Language Models (LLM...

0 Youliang Yuan, et al. ∙

research

∙ 07/16/2023

Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling

Modeling discourse – the linguistic phenomena that go beyond individual ...

0 Longyue Wang, et al. ∙

research

∙ 07/06/2023

On the Cultural Gap in Text-to-Image Generation

One challenge in text-to-image (T2I) generation is the inadvertent refle...

0 Bingshuai Liu, et al. ∙

research

∙ 06/28/2023

SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills

Traditional multitask learning methods basically can only exploit common...

0 Zhangyin Feng, et al. ∙

research

∙ 06/20/2023

Explicit Syntactic Guidance for Neural Text Generation

Most existing text generation models follow the sequence-to-sequence par...

0 Yafu Li, et al. ∙

research

∙ 06/15/2023

Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration

Although instruction-tuned large language models (LLMs) have exhibited r...

0 Chenyang Lyu, et al. ∙

research

∙ 06/04/2023

Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model

Sentence embedding is one of the most fundamental tasks in Natural Langu...

0 Lingfeng Shen, et al. ∙

research

∙ 05/26/2023

Improved Visual Story Generation with Adaptive Context Modeling

Diffusion models developed on top of powerful text-to-image generation m...

0 Zhangyin Feng, et al. ∙

research

∙ 05/22/2023

Deepfake Text Detection in the Wild

Recent advances in large language models have enabled them to reach a le...

0 Yafu Li, et al. ∙

research

∙ 05/22/2023

A Frustratingly Simple Decoding Method for Neural Text Generation

We introduce a frustratingly simple, super efficient and surprisingly ef...

0 Haoran Yang, et al. ∙

research

∙ 05/17/2023

A Survey on Zero Pronoun Translation

Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g. C...

0 Longyue Wang, et al. ∙

research

∙ 05/13/2023

A Simple and Plug-and-play Method for Unsupervised Sentence Representation Enhancement

Generating proper embedding of sentences through an unsupervised way is ...

0 Lingfeng Shen, et al. ∙

research

∙ 04/05/2023

ParroT: Translating During Chat Using Large Language Models

Large language models (LLMs) like ChatGPT and GPT-4 have exhibited remar...

0 Wenxiang Jiao, et al. ∙

research

∙ 04/05/2023

Document-Level Machine Translation with Large Language Models

Large language models (LLMs) such as Chat-GPT can produce coherent, cohe...

0 Longyue Wang, et al. ∙

research

∙ 03/23/2023

Is ChatGPT A Good Keyphrase Generator? A Preliminary Study

The emergence of ChatGPT has recently garnered significant attention fro...

0 Mingyang Song, et al. ∙

research

∙ 12/02/2022

Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning

The spread of rumors along with breaking events seriously hinders the tr...

0 Hongzhan Lin, et al. ∙

research

∙ 08/03/2022

Effidit: Your AI Writing Assistant

In this technical report, we introduce Effidit (Efficient and Intelligen...

3 Shuming Shi, et al. ∙

research

∙ 05/12/2022

One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code

People perceive the world with multiple senses (e.g., through hearing so...

9 Yong Dai, et al. ∙

research

∙ 04/26/2022

SkillNet-NLG: General-Purpose Natural Language Generation with a Sparsely Activated Approach

We present SkillNet-NLG, a sparsely activated approach that handles many...

0 Junwei Liao, et al. ∙

research

∙ 04/26/2022

Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors

Chinese BERT models achieve remarkable progress in dealing with grammati...

1 Cong Zhou, et al. ∙

research

∙ 04/21/2022

A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation

Towards building intelligent dialogue agents, there has been a growing i...

0 Yu Cao, et al. ∙

research

∙ 03/29/2022

Investigating Data Variance in Evaluations of Automatic Machine Translation Metrics

Current practices in metric evaluation focus on one single dataset, e.g....

1 Jiannan Xiang, et al. ∙

research

∙ 03/16/2022

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation

In this paper, we present a substantial step in better understanding the...

0 Wenxuan Wang, et al. ∙

research

∙ 03/12/2022

MarkBERT: Marking Word Boundaries Improves Chinese BERT

We present a Chinese BERT model dubbed MarkBERT that uses word informati...

12 Linyang Li, et al. ∙

research

∙ 03/09/2022

Efficient Sub-structured Knowledge Distillation

Structured prediction models aim at solving a type of problem where the ...

0 Wenye Lin, et al. ∙

research

∙ 03/07/2022

One Model, Multiple Tasks: Pathways for Natural Language Understanding

This paper presents a Pathways approach to handle many tasks at once. Ou...

1 Duyu Tang, et al. ∙

research

∙ 03/01/2022

Exploring and Adapting Chinese GPT to Pinyin Input Method

While GPT has become the de-facto method for text generation tasks, its ...

0 Minghuan Tan, et al. ∙

research

∙ 02/24/2022

Pretraining without Wordpieces: Learning Over a Vocabulary of Millions of Words

The standard BERT adopts subword-based tokenization, which may break a w...

15 Zhangyin Feng, et al. ∙

research

∙ 02/17/2022

Revisiting the Evaluation Metrics of Paraphrase Generation

Paraphrase generation is an important NLP task that has achieved signifi...

0 Lingfeng Shen, et al. ∙

research

∙ 01/09/2022

Rethink Stealthy Backdoor Attacks in Natural Language Processing

Recently, it has been shown that natural language processing (NLP) model...

0 Lingfeng Shen, et al. ∙

research

∙ 10/05/2021

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation

Pre-training (PT) and back-translation (BT) are two simple and powerful ...

0 Xuebo Liu, et al. ∙

research

∙ 08/26/2021

Rethinking Negative Sampling for Unlabeled Entity Problem in Named Entity Recognition

In many situations (e.g., distant supervision), unlabeled entity problem...

0 Yangming Li, et al. ∙

research

∙ 07/17/2021

On the Copying Behaviors of Pre-Training for Neural Machine Translation

Previous studies have shown that initializing neural machine translation...

0 Xuebo Liu, et al. ∙

research

∙ 06/03/2021

Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction

We investigate the problem of Chinese Grammatical Error Correction (CGEC...

4 Piji Li, et al. ∙

research

∙ 06/02/2021

Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation

Self-training has proven effective for improving NMT performance by augm...

0 Wenxiang Jiao, et al. ∙

research

∙ 05/31/2021

GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation

Computer-aided translation (CAT), the use of software to assist a human ...

5 Huayang Li, et al. ∙

research

∙ 05/30/2021

REAM♯: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation

The lack of reliable automatic evaluation metrics is a major impediment ...

0 Jun Gao, et al. ∙

research

∙ 05/27/2021

TranSmart: A Practical Interactive Machine Translation System

Automatic machine translation is super efficient to produce translations...

7 Guoping Huang, et al. ∙

research

∙ 12/31/2020

TexSmart: A Text Understanding System for Fine-Grained NER and Enhanced Semantic Analysis

This technique report introduces TexSmart, a text understanding system t...

10 Haisong Zhang, et al. ∙

research

∙ 12/29/2020

Dialogue Response Selection with Hierarchical Curriculum Learning

We study the learning of a matching model for dialogue response selectio...

4 Yixuan Su, et al. ∙

research

∙ 12/17/2020

Predicting Events in MOBA Games: Dataset, Attribution, and Evaluation

The multiplayer online battle arena (MOBA) games have become increasingl...

14 Zelong Yang, et al. ∙

research

∙ 12/10/2020

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

In many scenarios, named entity recognition (NER) models severely suffer...

0 Yangming Li, et al. ∙

research

∙ 12/10/2020

Segmenting Natural Language Sentences via Lexical Unit Analysis

In this work, we present Lexical Unit Analysis (LUA), a framework for ge...

0 Yangming Li, et al. ∙

research

∙ 10/10/2020

When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models

We address hypernymy detection, i.e., whether an is-a relationship exist...

0 Changlong Yu, et al. ∙

research

∙ 10/06/2020

On the Sub-Layer Functionalities of Transformer Decoder

There have been significant efforts to interpret the encoder of Transfor...

8 Yilin Yang, et al. ∙

research

∙ 10/06/2020

On the Branching Bias of Syntax Extracted from Pre-trained Language Models

Many efforts have been devoted to extracting constituency trees from pre...

0 Huayang Li, et al. ∙

research

∙ 08/14/2020

Interpretable Real-Time Win Prediction for Honor of Kings, a Popular Mobile MOBA Esport

With the rapid prevalence and explosive development of MOBA esports (Mul...

0 Zelong Yang, et al. ∙

research

∙ 05/04/2020

Evaluating Explanation Methods for Neural Machine Translation

Recently many efforts have been devoted to interpreting the black-box NM...

0 Jierui Li, et al. ∙

