Physics Enhanced Artificial Intelligence

03/11/2019
by Patrick O'Driscoll, et al.
Rebellion Photonics

We propose that intelligently combining models from the domains of Artificial Intelligence or Machine Learning with Physical and Expert models will yield a more "trustworthy" model than any one model from a single domain, given a sufficiently complex and narrowly defined problem. Based on mean-variance portfolio theory and bias-variance trade-off analysis, we prove that combining models from various domains produces a model that has lower risk, increasing user trust. We call such combined models physics enhanced artificial intelligence (PEAI), and suggest use cases for PEAI.


1 Introduction

1.1 Motivation

Recent theoretical developments in machine learning (ML), complemented by the astounding growth of computational power and the genesis of large data sets, have contributed to the rapid development of artificial intelligence (AI) systems. Even though the key findings required for a general AI system (strong AI) are considered a distant endeavor [1], AI systems designed to solve narrowly defined yet sufficiently challenging problems (weak AI or narrow AI) often match or exceed the performance of average humans [2, 3], and in many cases human experts, at these same tasks [4, 5, 6, 7]. These narrow AI solutions offer great potential for industry to automate, improve, and surpass unaided human productivity. In the rest of this paper, the term AI will refer to this narrow AI, unless stated otherwise. In spite of the great performance potential AI encompasses, user adoption has always been challenging. In many cases user trust becomes a bottleneck to industry-wide adoption, especially in fields such as aerospace, safety, and defence. New ways to enhance user trust in AI can therefore directly affect user adoption at a large scale. In addition, if such approaches can take advantage of existing solutions that were developed prior to AI solutions, they are likely to be more resource-friendly and even more attractive to industry.

1.2 Background and Related Work

In pursuit of industry-wide adoption of AI, new areas of research that focus on the trustworthiness of AI have emerged. Trust is a rich topic deeply rooted in many historical and philosophical discussions, and it is often tied to the study of risk in philosophical research [8]. As we are primarily interested in user adoption of new technology, we do not approach trust from a philosophical viewpoint and instead focus on the aspect of user trust. To many ordinary users, the lack of trust in AI may originate from the perception of the technology as a 'black box'. This perception reflects several deeper issues between human users and AI, including a lack of understanding of the scientific principles by which the AI is constructed, a lack of understanding of the functionality and limitations of ML-based systems, and a lack of transparency in the AI design process. Even for experts, the lack of a straightforward way to explain AI actions using domain knowledge can lead to a 'black box' perception. Recently there have been considerable research efforts on developing AI systems that are easily interpreted by humans, resulting in the emerging research field of eXplainable Artificial Intelligence, or XAI [9, 10, 11, 12]. XAI aims to bridge the gap of trust between AI systems and their users by providing explanations of AI systems with the intention of justifying, controlling, and improving AI actions. Since explanations are subjective to the human observer, this area has also expanded to include psychology, philosophy, and the cognitive science of people [13].

The long-term solution to enhancing user trust in AI is to advance theoretical understanding of the underlying principles of machine learning and data science, similar to how we gain trust in systems designed from physical principles. As a relatively young field, the AI community is beginning to address some of the issues that may hinder user trust, such as the lack of empirical rigor in some of the work conducted by the community [14]. Although fundamental questions in AI and ML research still remain, especially in deep learning [15], it should be noted that this alone is not a direct cause to diminish the trustworthiness of AI. There are open and outstanding questions within our mainstream theories of physics, as well as in our own understanding of the human brain and its functionality, and yet we generally trust systems designed via physical principles and judgments made by biological systems that we do not fully understand.

Other ways to improve user trust in and adoption of AI include demonstrating tractable performance improvements over non-AI systems on controlled data sets and experiments. This is normally achieved by building metrics to quantify performance [16]. However, any quantitative comparison will likely tie the performance to the testing data sets, with known data bias being an issue for AI systems [17]. Another practical way to restore user trust in AI services is to provide supplementary documentation, such as a supplier's declaration of conformity (SDoC), for the AI services provided [18].

In the above-mentioned efforts to build trustworthy AI systems, one aspect is overlooked: the existing trust of users in systems and solutions constructed using physics-based principles and other domain expert knowledge. In this paper we propose a new perspective on enhancing trust in AI by leveraging user trust in physical principles and expert knowledge, wherever applicable. By intelligently combining systems based on Artificial Intelligence or Machine Learning with Physical and Expert systems, we show that the overall system can have improved user trust. As the source of the enhanced trust comes from physics, we term this class of AI with enhanced trust Physics Enhanced Artificial Intelligence (PEAI). In the rest of this paper, we develop this idea by leveraging well known theories such as portfolio theory [19] and bias-variance trade-off analysis, and discuss the implications for the design of complex systems.

2 Physics Enhanced Artificial Intelligence (PEAI)

2.1 Definitions and Assumptions

In order to discuss the mathematical description of PEAI, we first discuss the basic concepts behind the ideas of trust and risk in this context. In general, a user is more likely to trust and adopt a new technology, presented in the form of a model, when it is explainable and has good performance. We therefore assume that user trust, for any given individual, is composed of three properties of the model:

  1. Interpretability or Explainability ($I$)

  2. Performance Accuracy ($A$)

  3. Performance Consistency ($C$)

In many practical applications where the task at hand is complex, AI models learned from data have lower interpretability compared with physics-based or expert-based models, and their consistency can be unknown or poor depending on how they are trained and the training data provided. However, they tend to be more accurate. Physics-based models are often quite interpretable and consistent, but are often not as accurate as the AI-derived models. Expert models, while often accurate, may not be as consistent or explainable as physics-based models. We argue that combining models or knowledge from the different domains of AI, physics, and expertise for narrowly defined tasks will yield a more trustworthy model than any one model or knowledge base from which it is composed. Interpretability is subjective, and it is beyond the scope of our discussion of PEAI. By assuming that the interpretability of the models in question is constant for a given individual, we maximize user trust by maximizing accuracy for a given model consistency.

As an attempt to qualify the relationship between trust and a given model, we express user trust $T$ as

$T = f(I, A, C),$   (1)

where $I$, $A$, and $C$ are the performance qualities defined above, assumed to belong to partially ordered domains. Further, we assume that

$\partial T / \partial A \geq 0$   (2)

and

$\partial T / \partial C \geq 0,$   (3)

such that we can optimize $T$ by improving accuracy and consistency. PEAI aims to develop a model to solve a narrow problem, consisting of a set of rules and requirements, that is described by a task $\mathcal{T}$. To avoid the discussion of trivial and unreasonable situations, we assume that $\mathcal{T}$ is sufficiently complex that the optimal solution is not known and would require near-infinite resources to identify. In order to solve $\mathcal{T}$, a model, or solution, $M$ is constructed, where $M$ consists of inputs from sensor measurements and its outputs. Let $\mathbb{S}$ be the universal set of solutions. As models can be constructed using different methods, we define the following subsets of $\mathbb{S}$:

$S_{\mathrm{AI}} = \{ M \in \mathbb{S} : M \text{ is constructed using AI/ML methods} \}$   (4)
$S_{\mathrm{Phy}} = \{ M \in \mathbb{S} : M \text{ is constructed using physics-based principles} \}$   (5)
$S_{\mathrm{Exp}} = \{ M \in \mathbb{S} : M \text{ is constructed using expert knowledge} \}$   (6)
$S_{\mathrm{AI,Phy}} = \{ M \in \mathbb{S} : M \text{ combines AI/ML and physics-based methods} \}$   (7)
$S_{\mathrm{AI,Exp}} = \{ M \in \mathbb{S} : M \text{ combines AI/ML and expert methods} \}$   (8)
$S_{\mathrm{Phy,Exp}} = \{ M \in \mathbb{S} : M \text{ combines physics-based and expert methods} \}$   (9)
$S_{\mathrm{AI,Phy,Exp}} = \{ M \in \mathbb{S} : M \text{ combines AI/ML, physics-based, and expert methods} \}$   (10)

We also assume that there exist multiple competing models in each of the above defined sets. A PEAI system (or model), $M_{\mathrm{PEAI}}$, belongs to the set

$S_{\mathrm{PEAI}} = (S_{\mathrm{AI}} \cap S_{\mathrm{Phy}}) \cup (S_{\mathrm{AI}} \cap S_{\mathrm{Exp}})$   (11)
$\qquad\qquad \cup\, S_{\mathrm{AI,Phy}} \cup S_{\mathrm{AI,Exp}} \cup S_{\mathrm{AI,Phy,Exp}}.$   (12)

For a complex task $\mathcal{T}$, it is highly unlikely that one arrives at exactly the same mathematical model when using different modeling methods; therefore the intersections between any two of the single-method model sets are expected to be empty, i.e., $S_{\mathrm{AI}} \cap S_{\mathrm{Phy}} = S_{\mathrm{AI}} \cap S_{\mathrm{Exp}} = S_{\mathrm{Phy}} \cap S_{\mathrm{Exp}} = \emptyset$. Under these conditions, $S_{\mathrm{PEAI}}$ reduces to

$S_{\mathrm{PEAI}} = S_{\mathrm{AI,Phy}} \cup S_{\mathrm{AI,Exp}} \cup S_{\mathrm{AI,Phy,Exp}}.$   (13)

We will examine $S_{\mathrm{PEAI}}$ for the remainder of this paper and discuss strategies for constructing $M_{\mathrm{PEAI}}$. For a complex problem, models are expensive to make, and no one model is perfect, so we consider composing a finite number of models. Finally, we assume that all models examined in $S_{\mathrm{PEAI}}$ are constructed in good faith and aim to provide the best results possible given their application and method.

2.2 Construction of PEAI

There are two strategies for constructing a PEAI model. The first is a composite model output approach: take models from the sets $S_{\mathrm{AI}}$, $S_{\mathrm{Phy}}$, and $S_{\mathrm{Exp}}$, and combine their outputs to form a new composite model in $S_{\mathrm{PEAI}}$. The composite model approach will be analyzed using an analogy to classical mean-variance portfolio theory. The second is a hybridization approach: modify the way the model is constructed by applying an intelligent constraint using information from another domain, generating a model in $S_{\mathrm{PEAI}}$. We will analyze this hybridized model using classical bias-variance trade-off analysis. We show that models composed using the above strategies yield a more consistent model for a desired accuracy.

2.2.1 Composite PEAI using Mean-Variance Portfolio Theory

The 1990 Nobel prize was awarded to Harry Markowitz for his 1952 'Portfolio Selection' essay [19]. His work laid the mathematical foundation of diversification by demonstrating that a combination of risky assets is less risky than any single asset. By treating the available models as risky assets, we can maximize user trust by minimizing the variance (risk) of the composite model. While this is conceptually a simple idea, it has a profound impact on the understanding of ML ensembles and composite model techniques.

Assume there exists a function $g : \mathbb{S} \to \mathbb{R}$ that can be evaluated on each of the models and gives a meaningful representation of model performance. Each model $M_i$, $i = 1, \dots, N$, has the output $g_i = g(M_i)$. Without loss of generality, we further assume that larger values of $g$ indicate better performance. For the combined model we have

$g_c = \sum_{i=1}^{N} w_i \, g_i,$   (14)

where $w_i$ is the relative weight given to model $M_i$, with $\sum_{i=1}^{N} w_i = 1$. Therefore $g_c$ is composed of all $N$ models. Using the distributive property of the expectation, denoted by $E[\cdot]$, we solve for the first moment of $g_c$,

$E[g_c] = \sum_{i=1}^{N} w_i \, E[g_i] = \mathbf{w}^{T} \boldsymbol{\mu},$   (15)

where $\mathbf{w} = (w_1, \dots, w_N)^{T}$, $\boldsymbol{\mu} = (\mu_1, \dots, \mu_N)^{T}$, and $\mu_i = E[g_i]$. We will assume that the elements of $\boldsymbol{\mu}$ are not all equal to each other, as the models are expected to give different expected values.

The second central moment (variance) of $g_c$ is

$\mathrm{Var}(g_c) = E\!\left[(g_c - E[g_c])^2\right] = \mathbf{w}^{T} \Sigma \mathbf{w},$   (16)

where $\Sigma$ is the covariance matrix of the model outputs. We will assume that $\Sigma^{-1}$ exists, as a result of the models each being unique rather than a composition of other models.
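
To make equations 14-16 concrete, here is a minimal numerical sketch that evaluates the mean and variance of a composite model for a chosen set of weights. The three models and their statistics ($\boldsymbol{\mu}$ and $\Sigma$) are illustrative assumptions, not values taken from this paper.

```python
import numpy as np

# Illustrative (assumed) performance statistics for three models,
# e.g. an AI model, a physics-based model, and an expert model.
mu = np.array([0.90, 0.80, 0.82])          # expected performance E[g_i]
Sigma = np.array([[0.040, 0.004, 0.002],   # covariance of the model outputs
                  [0.004, 0.010, 0.003],
                  [0.002, 0.003, 0.015]])

w = np.array([0.5, 0.3, 0.2])              # relative weights, sum to 1 (eq. 14)

mean_c = w @ mu                            # first moment of g_c, eq. 15
var_c = w @ Sigma @ w                      # variance of g_c, eq. 16

print(f"E[g_c]   = {mean_c:.4f}")
print(f"Var(g_c) = {var_c:.4f}")
```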

By minimizing the variance of the combined model we reduce risk, as the combined model is more likely to give a consistent answer. The minimization is subject to the previously defined constraints $\mathbf{w}^{T} \mathbf{1} = 1$ and $\mathbf{w}^{T} \boldsymbol{\mu} = \mu_c$, where $\mu_c$ is the desired expected performance of the composite model.

To find the optimized weights, we construct the Lagrangian $\mathcal{L}$:

$\mathcal{L} = \mathbf{w}^{T} \Sigma \mathbf{w} + \lambda_1 \left( \mu_c - \mathbf{w}^{T} \boldsymbol{\mu} \right) + \lambda_2 \left( 1 - \mathbf{w}^{T} \mathbf{1} \right).$   (17)

Taking the partial derivatives of $\mathcal{L}$ with respect to $\mathbf{w}$, $\lambda_1$, and $\lambda_2$, and setting them to zero:

$\partial \mathcal{L} / \partial \mathbf{w} = 2 \Sigma \mathbf{w} - \lambda_1 \boldsymbol{\mu} - \lambda_2 \mathbf{1} = 0,$   (18)
$\partial \mathcal{L} / \partial \lambda_1 = \mu_c - \mathbf{w}^{T} \boldsymbol{\mu} = 0,$   (19)
$\partial \mathcal{L} / \partial \lambda_2 = 1 - \mathbf{w}^{T} \mathbf{1} = 0.$   (20)

By rearranging equation 18,

$\mathbf{w}^{*} = \tfrac{1}{2} \Sigma^{-1} \left( \lambda_1 \boldsymbol{\mu} + \lambda_2 \mathbf{1} \right),$   (21)

where $\mathbf{w}^{*}$ is the set of optimized weights that minimizes $\mathrm{Var}(g_c)$. From equations 19 and 20:

$\mu_c = \mathbf{w}^{*T} \boldsymbol{\mu} = \tfrac{1}{2} \left( \lambda_1 \boldsymbol{\mu}^{T} \Sigma^{-1} \boldsymbol{\mu} + \lambda_2 \mathbf{1}^{T} \Sigma^{-1} \boldsymbol{\mu} \right),$   (22)
$1 = \mathbf{w}^{*T} \mathbf{1} = \tfrac{1}{2} \left( \lambda_1 \boldsymbol{\mu}^{T} \Sigma^{-1} \mathbf{1} + \lambda_2 \mathbf{1}^{T} \Sigma^{-1} \mathbf{1} \right).$   (23)

Define the new variables

$a = \mathbf{1}^{T} \Sigma^{-1} \mathbf{1}, \qquad b = \mathbf{1}^{T} \Sigma^{-1} \boldsymbol{\mu}, \qquad c = \boldsymbol{\mu}^{T} \Sigma^{-1} \boldsymbol{\mu}.$   (24)

By substituting the definitions from equation 24 into equations 22 and 23, we can solve for $\lambda_1$ and $\lambda_2$ in terms of $a$, $b$, and $c$:

$\lambda_1 = \frac{2 (a \mu_c - b)}{a c - b^2}, \qquad \lambda_2 = \frac{2 (c - b \mu_c)}{a c - b^2}.$   (25)

Finally, by substituting the above results into equation 16,

$\mathrm{Var}(g_c^{*}) = \mathbf{w}^{*T} \Sigma \mathbf{w}^{*} = \frac{a \mu_c^{2} - 2 b \mu_c + c}{a c - b^{2}}.$   (26)

It is important to note a few properties of $a$, $b$, and $c$. Since $\Sigma$ (and hence $\Sigma^{-1}$) is positive definite, $a > 0$ and $c > 0$, and by the Cauchy-Schwarz inequality $a c - b^{2} > 0$, with strict inequality because the elements of $\boldsymbol{\mu}$ are not all equal.

Equation 26 allows us to solve for the minimal risk for a desired mean value. With each $\mu_i$ unique, the models not perfectly correlated with one another, and $\Sigma$ positive definite, the feasible region from portfolio theory can be shown to be a two-dimensional region that is convex to the left when represented on the risk ($\sqrt{\mathrm{Var}(g_c)}$) vs. expected performance ($\mu_c$) plane; see Figure 1. In this figure, the class of optimal combined models lies on the thick black curve (the minimum-variance frontier). A sub-optimal model found by combining model outputs from various domains lies within the region bounded by the dotted line, and an unfeasible model lies outside this boundary.

Figure 1: Example of a feasible region of composite models from portfolio analysis. The optimal combined models lie on the minimum-variance frontier, marked by the thicker line.
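
As a numerical illustration of equation 26 and the shape of the feasible region in Figure 1, the sketch below (reusing the illustrative statistics assumed earlier) traces the minimum achievable variance for a range of target means; these (variance, mean) pairs form the left boundary of the feasible region.

```python
import numpy as np

# Same illustrative (assumed) statistics as in the earlier sketch.
mu = np.array([0.90, 0.80, 0.82])
Sigma = np.array([[0.040, 0.004, 0.002],
                  [0.004, 0.010, 0.003],
                  [0.002, 0.003, 0.015]])

Sinv = np.linalg.inv(Sigma)
ones = np.ones_like(mu)

# Scalars from equation 24.
a = ones @ Sinv @ ones
b = ones @ Sinv @ mu
c = mu @ Sinv @ mu

# Minimum achievable variance for each target mean (equation 26).
for mu_c in np.linspace(mu.min(), mu.max(), 5):
    var_min = (a * mu_c**2 - 2 * b * mu_c + c) / (a * c - b**2)
    print(f"mu_c = {mu_c:.3f}  ->  minimal Var(g_c) = {var_min:.5f}")
```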

An example of a feasible but sub-optimal point can be constructed and interpreted under the following conditions: if one were to make a model using a linear combination of the outputs of all $N$ models and assign each model a weight of $1/N$, one would construct the equivalent of an ensemble of models using a majority vote to make a decision. In practice this has been shown to increase the accuracy and reduce the variance of the model [20]; however, here we show that there exists a set of weights for these models that minimizes the risk of the prediction. Therefore, this equal-weight, majority-vote ensemble is likely to be a sub-optimal solution for a given performance, as the sketch below illustrates numerically.
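
The following sketch contrasts the equal-weight ensemble with the optimized weights of equations 21-25 at the same target mean, again using the statistics assumed in the earlier sketches: both composites deliver the same expected performance, but the optimized weights achieve a lower variance.

```python
import numpy as np

# Illustrative (assumed) statistics, as in the earlier sketches.
mu = np.array([0.90, 0.80, 0.82])
Sigma = np.array([[0.040, 0.004, 0.002],
                  [0.004, 0.010, 0.003],
                  [0.002, 0.003, 0.015]])

N = len(mu)
Sinv = np.linalg.inv(Sigma)
ones = np.ones(N)

# Equal-weight ("majority vote") ensemble.
w_eq = ones / N
mu_c = w_eq @ mu                   # use its mean as the common target
var_eq = w_eq @ Sigma @ w_eq

# Minimum-variance weights for the same target mean (eqs. 21, 24, 25).
a = ones @ Sinv @ ones
b = ones @ Sinv @ mu
c = mu @ Sinv @ mu
lam1 = 2 * (a * mu_c - b) / (a * c - b**2)
lam2 = 2 * (c - b * mu_c) / (a * c - b**2)
w_opt = 0.5 * Sinv @ (lam1 * mu + lam2 * ones)

print(f"target mean      : {mu_c:.4f}")
print(f"equal-weight Var : {var_eq:.5f}")
print(f"optimized Var    : {w_opt @ Sigma @ w_opt:.5f}  (weights {np.round(w_opt, 3)})")
print(f"optimized mean   : {w_opt @ mu:.4f}")
```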

In the case where models come from different domains, it is more likely that the models will have different expected performance, be less correlated with each other, and have different properties in their predictions. For example, the models in $S_{\mathrm{AI}}$ are likely to be more accurate than those in $S_{\mathrm{Phy}}$, but to have a higher variance. Therefore, when these models are combined, the composite model is able to obtain better performance than any one given model.

2.2.2 Hybridization PEAI using Bias-Variance Trade-Off Analysis

Hybridized PEAI models can be shown to have lower risk and enhanced user trust. First we derive the expressions for the bias and variance of a model. Here we assume that the observed output can be represented as a function of the inputs plus some error:

$y = F(x) + \epsilon,$   (27)

where $\epsilon$ is noise with zero mean and variance $\sigma_{\epsilon}^{2}$, such that $E[\epsilon] = 0$ and $\mathrm{Var}(\epsilon) = \sigma_{\epsilon}^{2}$. Also let $\hat{F}(x)$ be a deterministic approximation of the function $F(x)$, fitted from data, with prediction $\hat{y} = \hat{F}(x)$. Therefore we can express the mean squared error as a function of the bias, the variance, and the irreducible error:

$E\!\left[ \left( y - \hat{F}(x) \right)^{2} \right] = \left( \mathrm{Bias}[\hat{F}(x)] \right)^{2} + \mathrm{Var}[\hat{F}(x)] + \sigma_{\epsilon}^{2},$   (28)

where $\mathrm{Bias}[\hat{F}(x)] = E[\hat{F}(x)] - F(x)$ and the expectations are taken over the data used to fit $\hat{F}$ and over the noise $\epsilon$.

Normally, physics models tend to have a higher bias but low variance. On the other hand, AI models tend to have a high variance and low bias. By placing physics constraints on the AI model during learning or at run time, one places a bias on the new hybrid model, limiting its output space. This will cause the model to have a larger bias and, if done correctly, will dramatically reduce the variance of the hybrid model. This concept can be shown graphically in the model complexity vs. error diagram in Figure 2. By intelligently introducing physics-based constraints to AI models, or vice versa, we can arrive at models that have a lower total error, as the toy example following Figure 2 illustrates.

Figure 2: Model complexity vs. Total Error. The optimal models are likely to belong to the class of PEAI models.
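
As a toy Monte-Carlo illustration of this trade-off (with made-up numbers, not an experiment from this paper), the sketch below clips a high-variance, low-bias estimate to a physics-derived admissible range: the clipping adds a small bias but removes enough variance to lower the total error, in line with equation 28.

```python
import numpy as np

rng = np.random.default_rng(0)

F_true = 3.0            # true value of the quantity being predicted (assumed)
sigma_ai = 2.0          # spread of a high-variance, low-bias "AI" estimator (assumed)
lo, hi = 0.0, 5.0       # physics-derived admissible range (assumed)

n_trials = 100_000
ai_pred = F_true + rng.normal(0.0, sigma_ai, n_trials)   # unconstrained AI estimates
hybrid_pred = np.clip(ai_pred, lo, hi)                   # physics-constrained hybrid

def report(name, pred):
    bias = pred.mean() - F_true
    var = pred.var()
    mse = np.mean((pred - F_true) ** 2)   # approx. bias^2 + variance (eq. 28, no noise term)
    print(f"{name:8s}  bias = {bias:+.3f}   var = {var:.3f}   MSE = {mse:.3f}")

report("AI only", ai_pred)
report("hybrid", hybrid_pred)
```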

3 Implications of PEAI

Human learning builds on human observations and empirical evidence of the surrounding world, and this accumulated and learned knowledge is passed on, resulting in system designs or models that are physics-based, as shown by the architecture in Figure 3. Similarly, data-driven AI approaches use ML to arrive at a design or model based on the collected data. The combined model allows AI solutions to be constrained by physical solutions and expert knowledge, which enhances performance and trust. In addition, user trust in PEAI can be further enhanced by having a human supervisor in the loop, where the supervisor monitors the AI and provides feedback to the system that can improve its performance.

Figure 3: Physics enhanced AI and its relationships with machine learning and human learning

It is interesting to point out that many have already worked on this class of PEAI systems, though the direct increase in user trust was not the motivation. For example, AI systems with physical constraints are best illustrated by the work on physics-informed deep learning [21, 22], where differential equation constraints that represent a physical model are combined with a neural network to form a PEAI. In the field of predictive turbulence modeling, a physics-informed ML framework has been proposed [23], where the functional form of the Reynolds stress discrepancy in Reynolds-averaged Navier-Stokes (RANS) simulations is learned directly from available data. In the area of real-time vision-based event monitoring at industrial sites, physical constraints are added to AI models in order to improve performance and enhance user trust.
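
The following is a minimal sketch of the physics-informed idea, not the implementation of [21, 22]: a differential-equation residual is added to the data-fitting loss, biasing the learned model toward physically consistent solutions. The decay law, the polynomial surrogate, and the loss weighting are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed physical law: du/dt + k*u = 0 (simple decay), with k known.
k = 1.0
rng = np.random.default_rng(1)

# A few noisy observations of the true solution u(t) = exp(-k*t).
t_data = np.array([0.0, 0.5, 1.0, 1.5])
y_data = np.exp(-k * t_data) + rng.normal(0.0, 0.05, t_data.size)

# Collocation points where the physics residual is enforced.
t_col = np.linspace(0.0, 2.0, 21)

def u(theta, t):                      # small polynomial surrogate model
    return theta[0] + theta[1] * t + theta[2] * t**2

def du_dt(theta, t):                  # its analytic time derivative
    return theta[1] + 2.0 * theta[2] * t

def loss(theta, lam=1.0):
    data_term = np.mean((u(theta, t_data) - y_data) ** 2)                      # fit the data
    physics_term = np.mean((du_dt(theta, t_col) + k * u(theta, t_col)) ** 2)   # obey the ODE
    return data_term + lam * physics_term

res = minimize(loss, x0=np.zeros(3))
print("fitted surrogate coefficients:", np.round(res.x, 3))
print("data misfit   :", round(float(loss(res.x, lam=0.0)), 5))
print("physics misfit:", round(float(loss(res.x, lam=1.0) - loss(res.x, lam=0.0)), 5))
```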

Being able to quickly identify and design AI solutions that have potential for wide-ranging industrial adoption is challenging. The AI solutions that are most likely to receive wider adoption are the ones that can earn trust from users and deliver improved productivity at the same time. By explicitly pointing out the connection between enhanced user trust and combining models from different domains, we suggest that one should always seek to combine AI models with prior models, when available.

4 Conclusions

Physics enhanced AI (PEAI) is a class of models formed by intelligently combining models from the domains of artificial intelligence or machine learning with physical and expert models. We showed that doing so reduces model risk, resulting in a more "trustworthy" model than any one model from a single domain. PEAI is presented as a solution to improve user adoption of AI that solves complex yet narrowly defined problems.

References