DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text Generation in E-commerce Title and Review Summarization

12/15/2021
by   Xueying Zhang, et al.

We propose a novel domain-specific generative pre-training (DS-GPT) method for text generation and apply it to the product title and review summarization problems on E-commerce mobile display. First, we adopt a decoder-only transformer architecture, which suits fine-tuning tasks well because the input and output are combined into a single sequence. Second, we demonstrate that using only a small amount of pre-training data from related domains is powerful. Pre-training a language model on a general corpus such as Wikipedia or the Common Crawl requires a tremendous commitment of time and resources, and can be wasteful if the downstream tasks are limited in variety. Our DSGPT is pre-trained on a limited dataset, the Chinese short text summarization dataset (LCSTS). Third, our model does not require product-related human-labeled data. For the title summarization task, the state of the art explicitly uses additional background knowledge in the training and prediction stages. In contrast, our model implicitly captures this knowledge and achieves significant improvement over other methods after fine-tuning on the public Taobao.com dataset. For the review summarization task, we use a JD.com in-house dataset and observe similar improvement over standard machine translation methods, which lack the flexibility of fine-tuning. Our proposed work can be easily extended to other domains for a wide range of text generation tasks.
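The idea of combining input and output into one sequence for a decoder-only model can be sketched as follows. This is a minimal illustration, not the authors' implementation: the separator and end-of-sequence token ids, the function name `build_example`, and the choice to compute the loss only on the summary tokens are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

SEP_ID = 1  # hypothetical separator token id (assumption, not from the paper)
EOS_ID = 2  # hypothetical end-of-sequence token id (assumption)


def build_example(
    source_ids: Sequence[int],
    target_ids: Sequence[int],
    sep_id: int = SEP_ID,
    eos_id: int = EOS_ID,
) -> Tuple[List[int], List[int]]:
    """Concatenate source and target into one sequence for a decoder-only model.

    Returns the token sequence and a loss mask of the same length. The mask is
    1 on positions the model should learn to generate (the target and the
    end-of-sequence token) and 0 elsewhere. Masking the source is an
    assumption of this sketch.
    """
    tokens = list(source_ids) + [sep_id] + list(target_ids) + [eos_id]
    loss_mask = [0] * (len(source_ids) + 1) + [1] * (len(target_ids) + 1)
    return tokens, loss_mask


# Example: a three-token product title and a two-token summary.
tokens, loss_mask = build_example([10, 11, 12], [20, 21])
print(tokens)
print(loss_mask)
```

At inference time, under the same assumptions, the model would be fed the source tokens followed by the separator and asked to continue generating until it emits the end-of-sequence token.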


