Introduction to Rare-Event Predictive Modeling for Inferential Statisticians – A Hands-On Application in the Prediction of Breakthrough Patents

03/30/2020
by Daniel Hain, et al.

Recent years have seen substantial development of quantitative methods, mostly led by the computer science community with the goal of developing better machine learning applications, mainly focused on predictive modeling. However, economic, management, and technology forecasting research has up to now been hesitant to apply predictive modeling techniques and workflows. In this paper, we introduce a machine learning (ML) approach to quantitative analysis geared towards optimizing predictive performance, contrasting it with standard practices in inferential statistics, which focus on producing good parameter estimates. We discuss the potential synergies between the two fields against the backdrop of this, at first glance, target incompatibility. We discuss fundamental concepts in predictive modeling, such as out-of-sample model validation, variable and model selection, generalization, and hyperparameter tuning procedures. We provide a hands-on introduction to predictive modeling for a quantitative social science audience, while aiming to demystify computer science jargon. We use the example of high-quality patent identification, guiding the reader through various model classes and procedures for data pre-processing, modeling, and validation. We start with more familiar, easy-to-interpret model classes (Logit and Elastic Nets), continue with less familiar non-parametric approaches (Classification Trees and Random Forest), and finally present artificial neural network architectures, first a simple feed-forward network and then a deep autoencoder geared towards anomaly detection. Instead of limiting ourselves to the introduction of standard ML techniques, we also present state-of-the-art yet approachable techniques from artificial neural networks and deep learning to predict rare phenomena of interest.
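As a minimal illustration of out-of-sample validation in a rare-event setting, the following sketch holds out a stratified test set and scores a naive baseline that never predicts the rare class. It is not taken from the paper: the synthetic labels (2% positives), the 25% test share, and the helper names `stratified_split` and `precision_recall` are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_share=0.25, seed=42):
    """Split indices into train/test so each class keeps roughly the same share in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        # At least one observation per class goes to the test set.
        n_test = max(1, round(len(idx) * test_share))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (rare) class; 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical data: 1,000 observations, of which 2% are the rare positive class.
labels = [1] * 20 + [0] * 980
train_idx, test_idx = stratified_split(labels)
y_test = [labels[i] for i in test_idx]

# Naive baseline: always predict the majority (negative) class.
y_pred = [0] * len(y_test)
accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
precision, recall = precision_recall(y_test, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On data this imbalanced, the baseline reaches high accuracy on the held-out set while recovering none of the rare cases, which is why class-specific measures such as precision and recall are usually reported alongside accuracy when validating rare-event models.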


