Japanese Lexical Complexity for Non-Native Readers: A New Dataset

06/30/2023
by Yusuke Ide, et al.

Lexical complexity prediction (LCP) is the task of predicting the complexity of words in a text on a continuous scale. It plays a vital role in simplifying or annotating complex words to assist readers. To study lexical complexity in Japanese, we construct the first Japanese LCP dataset. Our dataset provides separate complexity scores for Chinese/Korean annotators and others to address the readers' L1-specific needs. In the baseline experiment, we demonstrate the effectiveness of a BERT-based system for Japanese LCP.

