An Empirical Study for Vietnamese Constituency Parsing with Pre-training

10/19/2020
by Tuan-Vi Tran, et al.

In this work, we use a span-based approach for Vietnamese constituency parsing. Our method combines a self-attention encoder with a chart decoder that uses a CKY-style inference algorithm. We compare our empirical method with the pre-trained models XLM-RoBERTa and PhoBERT on two Vietnamese treebanks, VietTreebank and NIIVTB1, and analyze the experimental results. The results show that our model with XLM-RoBERTa achieves a significantly better F1-score than the other pre-trained models, reaching 81.19 on VietTreebank.
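The chart decoder in span-based parsers of this kind assigns a score to every span of the sentence and then uses CKY-style dynamic programming to recover the highest-scoring binary tree. The sketch below illustrates only that decoding step: the random span scores stand in for the outputs of a pre-trained encoder such as XLM-RoBERTa or PhoBERT, and the names (cky_decode, span_score) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of CKY-style chart decoding over span scores, in the spirit
# of span-based constituency parsers (e.g. Kitaev and Klein, 2018).
# The span scores are random placeholders for encoder outputs.
import random


def cky_decode(span_score, n):
    """Return the best-scoring binary bracketing of a sentence of length n.

    span_score(i, j) gives the model's score for the span covering words
    i..j-1 (fencepost indexing, 0 <= i < j <= n).
    """
    best = {}   # (i, j) -> best total score of a tree over span (i, j)
    split = {}  # (i, j) -> best split point k for span (i, j)

    # Length-1 spans: just the span's own score, no split.
    for i in range(n):
        best[(i, i + 1)] = span_score(i, i + 1)

    # Longer spans: the span's own score plus the best pair of sub-trees.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            k_best = max(range(i + 1, j),
                         key=lambda k: best[(i, k)] + best[(k, j)])
            split[(i, j)] = k_best
            best[(i, j)] = (span_score(i, j)
                            + best[(i, k_best)] + best[(k_best, j)])

    def build(i, j):
        # Read the bracketing back out of the split table.
        if j - i == 1:
            return (i, j)
        k = split[(i, j)]
        return ((i, j), build(i, k), build(k, j))

    return best[(0, n)], build(0, n)


if __name__ == "__main__":
    random.seed(0)
    n = 5  # a toy 5-word sentence
    scores = {(i, j): random.uniform(-1, 1)
              for i in range(n) for j in range(i + 1, n + 1)}
    total, tree = cky_decode(lambda i, j: scores[(i, j)], n)
    print("best score:", round(total, 3))
    print("bracketing:", tree)
```

In the full parser each span would also receive per-label scores and the best constituent label would be chosen alongside the best split point; the sketch keeps only the unlabeled bracketing to stay short.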

Related research

12/31/2018 - Multilingual Constituency Parsing with Self-Attention and Pre-Training
We extend our previous work on constituency parsing (Kitaev and Klein, 2...

12/08/2021 - MLP Architectures for Vision-and-Language Modeling: An Empirical Study
We initiate the first empirical study on the use of MLP architectures fo...

01/02/2021 - Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
In this paper, we generalize text infilling (e.g., masked language model...

09/11/2023 - Improving Information Extraction on Business Documents with Specific Pre-Training Tasks
Transformer-based Language Models are widely used in Natural Language Pr...

05/21/2019 - Generating Logical Forms from Graph Representations of Text and Entities
Structured information about entities is critical for many semantic pars...

07/29/2023 - JFinder: A Novel Architecture for Java Vulnerability Identification Based Quad Self-Attention and Pre-training Mechanism
Software vulnerabilities pose significant risks to computer systems, imp...

11/03/2016 - An empirical study for Vietnamese dependency parsing
This paper presents an empirical comparison of different dependency pars...
