Towards Semantically Enhanced Data Understanding

06/13/2018
by Markus Schröder, et al.

In the field of machine learning, data understanding is the practice of gaining initial insights into unknown datasets. Such knowledge-intensive tasks require extensive documentation, which data scientists need in order to grasp the meaning of the data. Usually, this documentation is kept separate from the data in various external documents, diagrams, spreadsheets and tools, which causes considerable lookup overhead. Moreover, other supporting applications are not able to consume and utilize such unstructured data. We therefore propose a methodology that uses a single semantic model to interlink data with its documentation. Data scientists can then look up the connected information about the data directly by following links. Equally, they can browse the documentation, which always refers back to the data. Furthermore, the model can be used by other approaches that provide additional support, such as searching, comparing, integrating or visualizing data. To showcase our approach, we also demonstrate an early prototype.


