Distributional Framework for Emergent Knowledge Acquisition and its Application to Automated Document Annotation

10/11/2012
by Vit Novacek, et al.

This paper introduces a framework for representing and acquiring knowledge that emerges from large samples of textual data. We use a tensor-based distributional representation of simple statements extracted from text, and show how this representation can be used to infer emergent knowledge patterns from the data in an unsupervised manner. Examples of the patterns we investigate are implicit term relationships and conjunctive IF-THEN rules. To evaluate the practical relevance of our approach, we apply it to the annotation of life science articles with terms from MeSH (Medical Subject Headings, a controlled biomedical vocabulary and thesaurus).
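To make the idea of a tensor-based distributional representation more concrete, here is a minimal toy sketch. It is not the paper's implementation: the statements, the weights, the function names (`build_tensor`, `subject_vector`, `cosine`) and the choice of cosine similarity are all assumptions made for illustration. It stores weighted (subject, predicate, object) statements as a sparse three-way tensor, flattens the slice belonging to each subject into a vector, and compares subjects to suggest an implicit relationship between terms that are never directly linked in the input.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy statements: (subject, predicate, object, weight).
# The weights stand in for corpus-derived scores; they are invented here.
STATEMENTS = [
    ("aspirin", "treats", "headache", 3.0),
    ("aspirin", "is_a", "drug", 2.0),
    ("ibuprofen", "treats", "headache", 2.0),
    ("ibuprofen", "is_a", "drug", 1.0),
    ("headache", "is_a", "symptom", 4.0),
]


def build_tensor(statements):
    """Store a sparse 3-way tensor as {(subject, predicate, object): weight}."""
    tensor = defaultdict(float)
    for subj, pred, obj, weight in statements:
        tensor[(subj, pred, obj)] += weight
    return dict(tensor)


def subject_vector(tensor, subject):
    """Flatten the tensor slice of one subject into {(predicate, object): weight}."""
    return {(p, o): w for (s, p, o), w in tensor.items() if s == subject}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(key, 0.0) for key, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


tensor = build_tensor(STATEMENTS)
aspirin = subject_vector(tensor, "aspirin")
ibuprofen = subject_vector(tensor, "ibuprofen")
headache = subject_vector(tensor, "headache")

# Terms that share (predicate, object) contexts score high; unrelated ones score 0.
sim_drugs = cosine(aspirin, ibuprofen)
sim_mixed = cosine(aspirin, headache)
print(f"aspirin vs ibuprofen: {sim_drugs:.3f}")
print(f"aspirin vs headache:  {sim_mixed:.3f}")
```

In this toy data, "aspirin" and "ibuprofen" never occur in the same statement, yet their slices overlap almost completely, which is the kind of implicit term relationship the abstract refers to. A real pipeline would need statement extraction from text and a principled weighting scheme, neither of which is shown here.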

