Comprehending Semantic Types in JSON Data with Graph Neural Networks

07/24/2023
by Shuang Wei, et al.

Semantic types describe data in more detail than atomic types such as strings or integers. They connect columns to real-world concepts, providing fine-grained information that is useful for tasks such as automated data cleaning, schema matching, and data discovery. Existing deep learning models trained on large text corpora have been successful at single-column semantic type prediction for relational data. In this work, we extend the semantic type prediction problem to JSON data, labeling types based on JSON Paths. JSON Path is a query language that navigates complex JSON data structures by specifying the location and content of elements; a JSON Path plays a role similar to that of a column in relational data. We use a graph neural network to capture the structural information within collections of JSON documents. Our model outperforms a state-of-the-art model in several cases. These results demonstrate the ability of our model to understand complex JSON data and its potential for JSON-related data processing tasks.
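
To make the column analogy concrete, the following is a minimal sketch of how values in a collection of JSON documents might be grouped by path, so that each path behaves like a column whose semantic type could be predicted. It is an illustration only, not the authors' pipeline: the function names `iter_json_paths` and `group_values_by_path`, the choice to collapse array indices into `[*]`, and the sample documents are all assumptions made for this example. Keys that contain dots or brackets are not escaped.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def iter_json_paths(node: Any, path: str = "$") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) for every leaf value under `node`.

    Array indices are collapsed to `[*]` (an assumption of this sketch) so
    that all elements of one array share a single path.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_json_paths(child, f"{path}.{key}")
    elif isinstance(node, list):
        for child in node:
            yield from iter_json_paths(child, f"{path}[*]")
    else:
        # Leaf: string, number, boolean, or null.
        yield path, node


def group_values_by_path(documents: Iterable[Any]) -> Dict[str, List[Any]]:
    """Collect leaf values across documents, keyed by their path."""
    grouped: Dict[str, List[Any]] = {}
    for doc in documents:
        for path, value in iter_json_paths(doc):
            grouped.setdefault(path, []).append(value)
    return grouped


# Hypothetical sample documents, invented for illustration.
docs = [
    json.loads('{"user": {"name": "Ada", "emails": ["ada@example.com"]}}'),
    json.loads(
        '{"user": {"name": "Alan", '
        '"emails": ["alan@example.com", "amt@example.org"]}}'
    ),
]

columns = group_values_by_path(docs)
for path, values in columns.items():
    print(path, values)
```

Each key of `columns` is a candidate unit for semantic type labeling, in the same way a column would be in a relational table. The nesting of paths (for example, `$.user.name` and `$.user.emails[*]` sharing the parent `$.user`) is the kind of structural information a graph-based model could exploit, though how the paper encodes it is not specified in this abstract.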

