SmartEDA: An R Package for Automated Exploratory Data Analysis

03/12/2019
by Sayan Putatunda, et al.

This paper introduces SmartEDA, an R package for performing exploratory data analysis (EDA). EDA is generally the first step one performs before developing any machine learning or statistical model. Its goal is to support an initial investigation of the data through descriptive statistics and visualizations; in other words, to summarize and explore the data. The need for EDA was one of the factors that led to the development of various statistical computing packages over the years, including the R programming language, which is very popular and currently the most widely used software for statistical computing. However, EDA is a tedious task that requires manual effort, and some of the open source packages available in R are not up to the mark. In this paper, we propose a new open source package for R, SmartEDA, to address the need for automation of exploratory data analysis. We discuss the various features of SmartEDA and illustrate some of its applications for generating actionable insights using a couple of real-world datasets. We also perform a comparative study of SmartEDA against other packages for exploratory data analysis available in the Comprehensive R Archive Network (CRAN).

