Columnar Database Techniques for Creating AI Features

12/07/2017
by Brad Carlile, et al.

Recent advances in in-memory columnar database techniques have increased the performance of analytical queries on very large databases and data warehouses. At the same time, advances in artificial intelligence (AI) algorithms have increased the ability to analyze data. We use the term AI to encompass both Deep Learning (DL, or neural networks) and Machine Learning (ML, also known as Big Data analytics). Our exploration of the full AI stack has led us to a cross-stack columnar database innovation that efficiently creates features for AI analytics. The innovation is to create Augmented Dictionary Values (ADVs) and add them to existing columnar database dictionaries, which increases the efficiency of featurization by minimizing data movement and data duplication. We show how various forms of featurization (feature selection, feature extraction, and feature creation) can be efficiently calculated in a columnar database. The full-stack AI investigation has also led us to propose an integrated columnar database and AI architecture. This architecture has information flows and feedback loops that improve the whole analytics cycle across multiple iterations of extracting data from the data sources, featurization, and analysis.
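
To make the idea of adding values to a column dictionary concrete, here is a minimal sketch in Python. It is not the authors' implementation: the class `DictColumn`, its methods `add_augmented` and `feature`, and the example feature `name_length` are all hypothetical names chosen for illustration. The sketch assumes a plain dictionary-encoded column (one integer code per row) and stores one extra feature value per distinct dictionary entry, so the feature is computed once per distinct value instead of once per row and the row data is never copied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DictColumn:
    """Dictionary-encoded column: each row holds a code into `dictionary`."""
    dictionary: List[str]   # distinct values, one entry per code
    codes: List[int]        # per-row codes indexing into `dictionary`
    # Hypothetical stand-in for augmented dictionary values:
    # feature name -> one value per dictionary entry.
    augmented: Dict[str, List[float]] = field(default_factory=dict)

    @classmethod
    def encode(cls, values: List[str]) -> "DictColumn":
        # Build a sorted dictionary of distinct values and map rows to codes.
        dictionary = sorted(set(values))
        index = {v: i for i, v in enumerate(dictionary)}
        return cls(dictionary, [index[v] for v in values])

    def add_augmented(self, name: str, fn: Callable[[str], float]) -> None:
        # Evaluate the feature once per distinct value, not once per row.
        self.augmented[name] = [fn(v) for v in self.dictionary]

    def feature(self, name: str) -> List[float]:
        # Materialize the per-row feature by looking up each row's code.
        table = self.augmented[name]
        return [table[c] for c in self.codes]


col = DictColumn.encode(["red", "blue", "red", "green", "blue", "red"])
col.add_augmented("name_length", lambda v: float(len(v)))
print(col.dictionary)               # ['blue', 'green', 'red']
print(col.codes)                    # [2, 0, 2, 1, 0, 2]
print(col.feature("name_length"))   # [3.0, 4.0, 3.0, 5.0, 4.0, 3.0]
```

In this toy column the feature function runs three times (once per distinct value) rather than six times (once per row); the saving grows with the ratio of rows to distinct values.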


