Analyzing Vietnamese Legal Questions Using Deep Neural Networks with Biaffine Classifiers

04/27/2023
by   Nguyen Anh Tu, et al.
0

In this paper, we propose using deep neural networks to extract important information from Vietnamese legal questions, a fundamental task towards building a question answering system in the legal domain. Given a legal question in natural language, the goal is to extract all the segments that contain the needed information to answer the question. We introduce a deep model that solves the task in three stages. First, our model leverages recent advanced autoencoding language models to produce contextual word embeddings, which are then combined with character-level and POS-tag information to form word representations. Next, bidirectional long short-term memory networks are employed to capture the relations among words and generate sentence-level representations. At the third stage, borrowing ideas from graph-based dependency parsing methods which provide a global view on the input sentence, we use biaffine classifiers to estimate the probability of each pair of start-end words to be an important segment. Experimental results on a public Vietnamese legal dataset show that our model outperforms the previous work by a large margin, achieving 94.79 effectiveness of using contextual features extracted from pre-trained language models combined with other types of features such as character-level and POS-tag features when training on a limited dataset.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/16/2023

NOWJ1@ALQAC 2023: Enhancing Legal Task Performance with Classic Statistical Models and Pre-trained Language Models

This paper describes the NOWJ1 Team's approach for the Automated Legal Q...
research
07/02/2017

DAG-based Long Short-Term Memory for Neural Word Segmentation

Neural word segmentation has attracted more and more research interests ...
research
08/27/2018

An Investigation of the Interactions Between Pre-Trained Word Embeddings, Character Models and POS Tags in Dependency Parsing

We provide a comprehensive analysis of the interactions between pre-trai...
research
06/25/2021

ParaLaw Nets – Cross-lingual Sentence-level Pretraining for Legal Text Processing

Ambiguity is a characteristic of natural language, which makes expressio...
research
11/19/2021

Building a Question Answering System for the Manufacturing Domain

The design or simulation analysis of special equipment products must fol...
research
11/21/2018

Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings

Domain generation algorithms (DGAs) are frequently employed by malware t...
research
01/07/2023

Graph-based Keyword Planning for Legal Clause Generation from Topics

Generating domain-specific content such as legal clauses based on minima...

Please sign up or login with your details

Forgot password? Click here to reset