PatentMatch: A Dataset for Matching Patent Claims & Prior Art

12/27/2020
by Julian Risch, et al.

Patent examiners need to solve a complex information retrieval task when they assess the novelty and inventive step of claims made in a patent application. Given a claim, they search for prior art, which comprises all relevant publicly available information. This time-consuming task requires a deep understanding of the respective technical domain and of patent-domain-specific language. For these reasons, we address the computer-assisted search for prior art by creating a training dataset for supervised machine learning called PatentMatch. It contains pairs of claims from patent applications and text passages from cited patent documents that correspond semantically to the claims to varying degrees. Each pair has been labeled by technically skilled patent examiners from the European Patent Office. The label indicates the degree of semantic correspondence (matching), i.e., whether or not the text passage is prejudicial to the novelty of the claimed invention. Preliminary experiments using a baseline system show that PatentMatch can indeed be used for training a binary text pair classifier on this challenging information retrieval task. The dataset is available online: https://hpi.de/naumann/s/patentmatch.
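
To make the task setup concrete, the following is a minimal sketch of what a labeled claim/passage pair and a binary text pair classifier could look like. It is an illustration under stated assumptions, not the dataset's actual schema or the baseline system from the paper: the class `ClaimPassagePair`, its field names, the token-overlap scoring, the threshold of 0.5, and the example texts are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimPassagePair:
    """One labeled example: a claim, a cited passage, and a binary label.

    Field names are hypothetical and do not reflect the dataset's real schema.
    """
    claim: str
    passage: str
    novelty_destroying: bool  # True if the passage is prejudicial to novelty


def tokens(text: str) -> set[str]:
    """Lowercase the text and split on whitespace into a set of tokens."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the two token sets (0.0 if both are empty)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def predict(pair: ClaimPassagePair, threshold: float = 0.5) -> bool:
    """Toy stand-in for a trained classifier: threshold on token overlap."""
    return jaccard(pair.claim, pair.passage) >= threshold


# Invented example texts, not taken from the dataset.
pairs = [
    ClaimPassagePair("a valve with a spring loaded seal",
                     "the valve has a spring loaded seal", True),
    ClaimPassagePair("a valve with a spring loaded seal",
                     "a method for brewing coffee", False),
]

if __name__ == "__main__":
    for pair in pairs:
        print(pair.novelty_destroying, predict(pair))
```

A lexical overlap score like this ignores paraphrase and domain-specific wording, which is exactly where a supervised classifier trained on examiner-labeled pairs would be expected to help.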
