Restoring Hebrew Diacritics Without a Dictionary

05/11/2021
by Elazar Gershuni, et al.
We demonstrate that it is feasible to diacritize Hebrew script without any human-curated resources other than plain diacritized text. We present NAKDIMON, a two-layer character-level LSTM that performs on par with much more complicated curation-dependent systems across a diverse array of modern Hebrew sources.
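
To illustrate what training from "plain diacritized text" alone can look like, here is a minimal sketch of how such text might be turned into per-character training pairs: the bare letters serve as input and the stripped marks serve as labels. The function names `split_diacritics` and `apply_diacritics` are hypothetical, and the labeling scheme is an assumption made for illustration; the abstract does not specify how NAKDIMON encodes its targets.

```python
import unicodedata


def split_diacritics(text):
    """Split diacritized text into bare characters and per-character labels.

    labels[i] holds the combining marks that followed bare[i] in the input
    (an empty string when the character carries no marks).
    Assumption: any Unicode combining mark counts as a diacritic.
    """
    bases = []
    labels = []
    for ch in text:
        if unicodedata.combining(ch) and bases:
            # Attach the mark to the most recent base character.
            labels[-1] += ch
        else:
            bases.append(ch)
            labels.append("")
    return "".join(bases), labels


def apply_diacritics(bare, labels):
    """Inverse of split_diacritics: re-attach each label to its character."""
    return "".join(b + l for b, l in zip(bare, labels))


# "shalom" with niqqud, written with escapes to avoid encoding issues:
# shin + qamats + shin dot, lamed, vav + holam, final mem.
sample = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"
bare, labels = split_diacritics(sample)
restored = apply_diacritics(bare, labels)
```

Under this framing, a character-level sequence model would read `bare` and predict one entry of `labels` per position, so no lexicon or morphological analyzer is involved in preparing the data.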


