A Big Data Analysis Framework Using Apache Spark and Deep Learning

11/25/2017
by Anand Gupta, et al.

With the growing prevalence of Big Data, many advances have recently been made in this field. Frameworks such as Apache Hadoop and Apache Spark have gained considerable traction over the past decade and have become massively popular, especially in industry. It is becoming increasingly evident that effective big data analysis is key to solving artificial intelligence problems. To this end, a multi-algorithm machine learning library, MLlib, was implemented in the Spark framework. While this library supports multiple machine learning algorithms, there is still scope to use the Spark setup efficiently for highly time-intensive and computationally expensive procedures like deep learning. In this paper, we propose a novel framework that combines the distributed computing capabilities of Apache Spark with the advanced machine learning architecture of a deep multi-layer perceptron (MLP), using the popular concept of Cascade Learning. We conduct an empirical analysis of our framework on two real-world datasets. The results are encouraging and corroborate our proposed framework, indicating that it improves on traditional big data analysis methods that use either Spark or deep learning alone.
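
The abstract does not say how the two stages of the cascade are wired together, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a first-stage model produces a class probability for each record and that this probability is appended as one extra feature before a second-stage MLP is trained. In the proposed framework the first stage would presumably run on Spark; here a plain-Python logistic regression stands in for it so the sketch runs without a cluster. All function names (`train_logistic`, `cascade_features`, `train_mlp`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow on extreme values.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Stage 1 (assumed): logistic regression trained by SGD.

    Stand-in for a model that would be trained on Spark.
    Returns a function mapping a feature vector to a probability.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of the log loss w.r.t. the pre-activation
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return lambda x: sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def cascade_features(X, stage1):
    """Append the stage-1 probability to each record as an extra feature."""
    return [list(x) + [stage1(x)] for x in X]


def train_mlp(X, y, hidden=4, epochs=300, lr=0.5, seed=0):
    """Stage 2 (assumed): one-hidden-layer MLP trained by SGD with backprop."""
    rng = random.Random(seed)
    n = len(X[0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [sigmoid(sum(w * xj for w, xj in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        out = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
        return h, out

    for _ in range(epochs):
        for xi, yi in zip(X, y):
            h, out = forward(xi)
            d_out = out - yi  # log-loss gradient at the output pre-activation
            for k in range(hidden):
                # Backpropagate through the hidden unit before updating W2[k].
                d_h = d_out * W2[k] * h[k] * (1.0 - h[k])
                W2[k] -= lr * d_out * h[k]
                for j in range(n):
                    W1[k][j] -= lr * d_h * xi[j]
                b1[k] -= lr * d_h
            b2 -= lr * d_out

    return lambda x: forward(x)[1]


# Toy data (made up for illustration only).
X_toy = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
y_toy = [0, 0, 1, 1]

stage1 = train_logistic(X_toy, y_toy)
X_aug = cascade_features(X_toy, stage1)  # original features + stage-1 probability
stage2 = train_mlp(X_aug, y_toy)
```

At prediction time a record would pass through the same two steps: compute the stage-one probability, append it to the features, and feed the augmented vector to `stage2`. Whether the actual framework passes probabilities, hard labels, or something else between stages is not stated in the abstract.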

research
08/02/2018

Mobile big data analysis with machine learning

This paper investigates to identify the requirement and the development ...
research
01/24/2019

When Machine Learning Meets Big Data: A Wireless Communication Perspective

We have witnessed an exponential growth in commercial data services, whi...
research
12/02/2014

Semantic HMC for Big Data Analysis

Analyzing Big Data can help corporations to improve their efficiency. I...
research
06/15/2019

Online Heterogeneous Mixture Learning for Big Data

We propose the online machine learning for big data analysis with hetero...
research
09/22/2017

Happy Travelers Take Big Pictures: A Psychological Study with Machine Learning and Big Data

In psychology, theory-driven researches are usually conducted with exten...
research
07/05/2019

Networkmetrics unraveled: MBDA in Action

We propose networkmetrics, a new data-driven approach for monitoring, tr...
research
03/24/2019

Deep recommender engine based on efficient product embeddings neural pipeline

Predictive analytics systems are currently one of the most important are...
