Enhanced Integrated Scoring for Cleaning Dirty Texts

10/02/2008
by Wilson Wong, et al.

A growing number of approaches to ontology engineering from text are turning to online sources such as company intranets and the World Wide Web. Despite this rise, little work has addressed the preprocessing and cleaning of dirty texts from online sources. This paper presents an enhancement of the Integrated Scoring for Spelling error correction, Abbreviation expansion and Case restoration (ISSAC). ISSAC is implemented as part of the text preprocessing phase of an ontology engineering system. New evaluations of the enhanced ISSAC on 700 chat records show an improved accuracy of 98%, higher than the accuracy obtained using only basic ISSAC and using only Aspell.


research
02/19/2018

Integrated Tools for Engineering Ontologies

The article presents an overview of current specialized ontology enginee...
research
05/26/2019

Evaluation of basic modules for isolated spelling error correction in Polish texts

Spelling error correction is an important problem in natural language pr...
research
08/06/2011

'Just Enough' Ontology Engineering

This paper introduces 'just enough' principles and 'systems engineering'...
research
01/04/2019

UTPO: User's Trust Profile Ontology - Modeling trust towards Online Health Information Sources

Despite the overwhelming quantity of health information that is availabl...
research
10/05/2022

Common Vulnerability Scoring System Prediction based on Open Source Intelligence Information Sources

The number of newly published vulnerabilities is constantly increasing. ...
research
07/19/2019

Fast Record Linkage for Company Entities

Record Linkage is an essential part of almost all real-world systems tha...
