Theory-guided Data Science: A New Paradigm for Scientific Discovery from Data

12/27/2016
by Anuj Karpatne, et al.

Data science models, although successful in a number of commercial domains, have had limited applicability in scientific problems involving complex physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm that aims to leverage the wealth of scientific knowledge for improving the effectiveness of data science models in enabling scientific discovery. The overarching vision of TGDS is to introduce scientific consistency as an essential component for learning generalizable models. Further, by producing scientifically interpretable models, TGDS aims to advance our scientific understanding by discovering novel domain insights. Indeed, the paradigm of TGDS has started to gain prominence in a number of scientific disciplines such as turbulence modeling, material discovery, quantum chemistry, bio-medical science, bio-marker discovery, climate science, and hydrology. In this paper, we formally conceptualize the paradigm of TGDS and present a taxonomy of research themes in TGDS. We describe several approaches for integrating domain knowledge in different research themes using illustrative examples from different disciplines. We also highlight some of the promising avenues of novel research for realizing the full potential of theory-guided data science.

