Patent Claim Generation by Fine-Tuning OpenAI GPT-2

07/01/2019
by Jieh-Sheng Lee, et al.

In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model to generate patent claims. GPT-2 has shown that pre-trained language models are highly effective on various tasks, particularly coherent text generation. Patent claim language has rarely been explored and poses a unique challenge. Our motivation is to generate coherent patent claims automatically so that augmented inventing might become viable someday. In our implementation, we identify a unique language structure in patent claims and leverage its implicit human annotations. We investigate the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of the generated patent claims. Our contributions include: (1) being the first to generate patent claims by machine and the first to apply GPT-2 to patent claim generation, (2) providing various experimental results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) building an e-mail bot for future researchers to explore the fine-tuned GPT-2 model further.
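
To make the distinction between conditional and unconditional random sampling concrete, here is a minimal Python sketch. It is not the authors' implementation and does not use GPT-2: the `TOY_MODEL` table, the `<start>` and `<end>` markers, the `sample` function, and the choice of top-k truncation are all assumptions made purely for illustration. Unconditional sampling starts from the start marker alone, while conditional sampling continues from a user-supplied prefix.

```python
import random

# Hypothetical markers and a toy next-token table standing in for a
# language model. A real model conditions on the full context; this
# sketch conditions only on the previous token.
START, END = "<start>", "<end>"
TOY_MODEL = {
    START: {"A": 0.6, "The": 0.4},
    "A": {"method": 0.5, "system": 0.5},
    "The": {"method": 0.5, "system": 0.5},
    "method": {"comprising": 0.9, END: 0.1},
    "system": {"comprising": 0.9, END: 0.1},
    "comprising": {"a": 1.0},
    "a": {"processor": 0.5, "memory": 0.5},
    "processor": {END: 1.0},
    "memory": {END: 1.0},
}


def sample(prefix=None, top_k=2, max_len=10, rng=None):
    """Randomly sample tokens, keeping only the top_k candidates per step.

    prefix=None  -> unconditional sampling (begin at START).
    prefix=[...] -> conditional sampling (continue the given tokens).
    """
    rng = rng or random.Random()
    tokens = list(prefix) if prefix else []
    context = tokens[-1] if tokens else START
    while len(tokens) < max_len:
        dist = TOY_MODEL[context]
        # Keep the top_k most probable next tokens, then draw one of
        # them at random in proportion to its probability.
        top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        words, weights = zip(*top)
        nxt = rng.choices(words, weights=weights, k=1)[0]
        if nxt == END:
            break
        tokens.append(nxt)
        context = nxt
    return tokens


unconditional = sample(rng=random.Random(1))
conditional = sample(prefix=["A", "system"], rng=random.Random(0))
print("unconditional:", " ".join(unconditional))
print("conditional:  ", " ".join(conditional))
```

With a fine-tuned model, the lookup in `TOY_MODEL` would be replaced by the model's predicted next-token distribution; the split between starting from nothing and starting from a prefix stays the same.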

research
08/26/2019

Measuring Patent Claim Generation by Span Relevancy

Our goal of patent claim generation is to realize "augmented inventing" ...
research
05/16/2019

IMHO Fine-Tuning Improves Claim Detection

Claims are the central component of an argument. Detecting claims across...
research
01/11/2020

PatentTransformer-2: Controlling Patent Text Generation by Structural Metadata

PatentTransformer is our codename for patent text generation based on Tr...
research
12/07/2019

Personalized Patent Claim Generation and Measurement

This work-in-progress paper proposes a framework to generate and measure...
research
11/22/2022

Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models

The simplest way to obtain continuous interpolation between two points i...
research
05/15/2022

Mitigating Toxic Degeneration with Empathetic Data: Exploring the Relationship Between Toxicity and Empathy

Large pre-trained neural language models have supported the effectivenes...
research
03/22/2021

Hybrid Model for Patent Classification using Augmented SBERT and KNN

Purpose: This study aims to provide a hybrid approach for patent claim c...
