Augmented Understanding and Automated Adaptation of Curation Rules

07/17/2020
by Alireza Tabebordbar, et al.

Over the past years, there have been many efforts to curate raw data and increase its added value. Data curation has been defined as the activities and processes an analyst undertakes to transform raw data into contextualized data and knowledge. Data curation enables decision-makers and data analysts to extract value and derive insight from raw data. However, to curate raw data, an analyst needs to carry out various curation tasks, including extraction, linking, classification, and indexing, which are error-prone, tedious, and challenging. Besides, deriving insight requires analysts to spend a long time scanning and analyzing the curation environment. This problem is exacerbated when the curation environment is large and the analyst needs to curate a varied and comprehensive list of data. To address these challenges, in this dissertation we present techniques, algorithms, and systems for augmenting analysts in curation tasks. We (1) propose a feature-based and automated technique for curating raw data; (2) propose an autonomic approach for adapting data curation rules; (3) provide a solution to augment users in formulating their preferences while curating data in large-scale information spaces; and (4) implement a set of APIs for automating basic curation tasks, including named entity extraction, part-of-speech (POS) tagging, and classification.

research
08/16/2013

Standardizing Interestingness Measures for Association Rules

Interestingness measures provide information that can be used to prune o...
research
12/10/2016

Data Curation APIs

Understanding and analyzing big data is firmly recognized as a powerful ...
research
01/03/2020

Information Extraction based on Named Entity for Tourism Corpus

Tourism information is scattered around nowadays. To search for the info...
research
10/24/2017

Automatic Generation of Benchmarks for Entity Recognition and Linking

The velocity dimension of Big Data plays an increasingly important role ...
research
01/10/2022

DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population

We present a new open-source and extensible knowledge extraction toolkit...
research
04/06/2023

TagGPT: Large Language Models are Zero-shot Multimodal Taggers

Tags are pivotal in facilitating the effective distribution of multimedi...
research
12/13/2018

Dynamic Transfer Learning for Named Entity Recognition

State-of-the-art named entity recognition (NER) systems have been improv...
