CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding

05/09/2023
by Yixiao Ma, et al.

Legal case retrieval is a critical process for modern legal information systems. Recent studies have built legal case retrieval models on pre-trained language models (PLMs) that follow the general-domain self-supervised pre-training paradigm, but using general-domain PLMs as backbones has limitations. Specifically, these models may not fully capture the underlying legal features in legal case documents. To address this issue, we propose CaseEncoder, a legal document encoder that leverages fine-grained legal knowledge in both the data sampling and pre-training phases. In the data sampling phase, we enhance the quality of the training data by using fine-grained law article information to guide the selection of positive and negative examples. In the pre-training phase, we design legal-specific pre-training tasks that align with the judging criteria of relevant legal cases. Based on these tasks, we introduce a loss function called Biased Circle Loss to enhance the model's ability to recognize case relevance at a fine-grained level. Experimental results on multiple benchmarks demonstrate that CaseEncoder significantly outperforms both existing general pre-training models and legal-specific pre-training models in zero-shot legal case retrieval.


research
04/22/2023

SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval

Legal case retrieval, which aims to find relevant cases for a query case...
research
05/11/2023

THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval

Legal case retrieval techniques play an essential role in modern intelli...
research
11/05/2022

Privacy-Preserving Models for Legal Natural Language Processing

Pre-training large transformer models with in-domain data improves domai...
research
09/14/2021

Legal Transformer Models May Not Always Help

Deep learning-based Natural Language Processing methods, especially tran...
research
02/01/2023

Zero-shot Transfer of Article-aware Legal Outcome Classification for European Court of Human Rights Cases

In this paper, we cast Legal Judgment Prediction on European Court of Hu...
research
05/06/2022

Fine-grained Intent Classification in the Legal Domain

A law practitioner has to go through a lot of long legal case proceeding...
research
11/10/2019

Searching for Legal Clauses by Analogy. Few-shot Semantic Retrieval Shared Task

We introduce a novel shared task for semantic retrieval from legal texts...
