
Manually Annotated Spelling Error Corpus for Amharic

This paper presents a manually annotated spelling error corpus for Amharic, the lingua franca of Ethiopia. The corpus is designed for evaluating spelling error detection and correction. Misspellings are tagged as either non-word or real-word errors. In addition, the contextual information available in the corpus makes it useful for dealing with both types of spelling errors.




UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

We present a corpus professionally annotated for grammatical error corre...

HORAE: an annotated dataset of books of hours

We introduce in this paper a new dataset of annotated pages from books o...

Finnish Paraphrase Corpus

In this paper, we introduce the first fully manually annotated paraphras...

How big is big enough? Unsupervised word sense disambiguation using a very large corpus

In this paper, the problem of disambiguating a target word for Polish is...

ViS-Á-ViS : Detecting Similar Patterns in Annotated Literary Text

We present a web-based system called ViS-Á-ViS aiming to assist literary...

Publishing a Quality Context-aware Annotated Corpus and Lexicon for Harassment Research

Having a quality annotated corpus is essential especially for applied re...

Artificial Error Generation with Machine Translation and Syntactic Patterns

Shortage of available training data is holding back progress in the area...