Recent theoretical developments in machine learning (ML), complemented by the astounding growth of computational power and the genesis of large data sets, have contributed to the rapid development of artificial intelligent (AI) systems. Even though the key findings required for a general AI system (strong AI) are considered a distant endeavor 
, AI systems designed to solve narrowly defined yet challenging enough problems (weak AI or narrow AI) are often comparable to or exceeding the performance of average humans[2, 3], and in many cases human experts, at these same tasks [4, 5, 6, 7]. These narrow AI solutions offer a great potential for industry to automate, improve, and surpass unaided human productivity. In the rest of this paper, the term AI will refer to this narrow AI, unless stated otherwise. In spite of the great performance potential AI encompasses, user adoption has always been challenging. In many cases user trust becomes a bottleneck towards industry-wide adoption, especially in aerospace, safety, and defence, to name a few. New ways to enhance user trust in AI can directly affect user adoption at a large scale. In addition, if new ways to enhance user trust in AI can take advantage of existing solutions that were developed prior to AI solutions, it is likely to be more resource-friendly and even more attractive to industry.
1.2 Background and Related Work
In pursuit of industry-wide adoption of AI, new areas of research that focus on the trustworthiness of AI have emerged. Trust is a topic of rich content deeply rooted in many historical and philosophical discussions, and is often tied with the study of risk in philosophical research . As we are primarily interested in user adoption of new technology, we are not approaching trust from a philosophical research viewpoint and focus on the aspect of user trust. To many ordinary users, the lack of trust in AI may have originated from the perception of the technology as a ‘black box’. This perception reflects several other profound issues between human users and AI, including lack of understanding of the scientific principles of how the AI is constructed, lack of understanding of the functionality and limitations of ML based systems, and lack of transparency in the AI design process. Even for experts, the lack of straightforward ways of explanation of AI action using domain knowledge can lead to a ‘black box’ perception. Recently there are considerable research efforts on developing AI systems that are easily interpreted by humans, resulting in an emerging research field of eXplainable Artificial Intelligence, or XAI [9, 10, 11, 12]. XAI aims to bridge the gap of trust between AI system and its users by providing explanation of AI systems with the intentions to justify, control, and improve AI actions. Since explanations are subjective to the human observer, this area has also expanded to include psychology, philosophy, and cognitive science of people .
The long term solution to enhancing user trust in AI is to progress theoretical understanding of the underlying principles of machine learning and data science, similar to how we gain trust in systems that are designed based on physical principles. Being a relatively young field, the AI community is beginning to address some of the issues that may hinder user trust, such as lack of empirical rigor of some of the works conducted by the community
. Although questions in the fundamental level of AI and ML research still remains, especially in deep learning, it should be noted that this alone does not provide a direct cause to diminish trustworthiness of AI. There are open and outstanding questions within our mainstream theories of physics, as well as in our own understanding of the human brain and its functionality, and yet we generally trust systems design via physical principles and judgments from biological systems that we do not fully understand.
Other ways to improve user trust and adoption in AI include providing tractable performance improvement compared to non-AI systems in controlled data set and experiments. This is normally achieved by building metrics to quantify performance . However, any quantitative comparison will likely tie the performance to the testing data sets with known data bias being an issue for AI systems . Another practical way to restoring user trust for AI services is by providing supplementary documentation such as a supplier’s declaration of conformity (SDoC) for the AI services provided .
In the above-mentioned efforts to build trustworthy AI systems, one aspect is overlooked: the existing trust of users in systems and solutions constructed using physics-based principles and other domain expert knowledge. In this paper we propose a new perspective to restore trust of AI by leveraging user trust in physical principles and expert knowledge, wherever applicable. By intelligently combining systems based on Artificial Intelligence or Machine Learning with Physical and Expert systems, we show that the overall system can have improved user trust. As the source of the enhanced trust comes from physics, we term this class of AI with enhanced trust the Physics Enhanced Artificial Intelligence (PEAI). In the rest of this paper, we will develop this idea by leveraging well known theories such as portfolio theory  and bias-variance trade-off analysis, and discuss the implications for design of complex systems.
2 Physics Enhanced Artificial Intelligence (PEAI)
2.1 Definitions and Assumptions
In order to discuss the mathematical description of PEAI, we first discuss the basic concepts behind the idea of trust and risk in this context. In general, a user is more likely to trust and adopt a new technology - presented in the form of a model, when it is explainable and has good performance. We therefore assume that user trust, for any given individual, is composed of three properties of the model:
Interpretability or Explainablity ()
Performance Accuracy ()
Performance Consistency ()
In many practical applications where the task at hand is complex, AI models learned from data have lower interpretability compared with physics-based or expert-based models, and their consistency can be unknown or poor depending on how they are trained and the training data provided. However, they tend to be more accurate. Physics-based models are often quite interpretable and consistent, but are often not as accurate as the AI derived models. Expert models, while often accurate, may not be as consistent or explainable as physics-based models. We argue that combining models or knowledge from different domains of AI, physics, and expert for narrowly defined tasks, will yield a more trustworthy model than any one model or knowledge base they are composed from. Interpretability is subjective, and it is beyond the scope of our discussion of PEAI. By assuming that the interpretability of the models in question is constant based upon a given individual, we maximize user trust by maximizing the accuracy for a given model consistency.
As an attempt to qualify the relationship between trust and a given model, we express user trust as
where , , and is a function of performance qualities that are assumed to belong to partially ordered domains. Further, we assume that
such that we can optimize . PEAI aims to develop a model to solve a narrow problem which consists of a set of rules and requirements that is described by task . To avoid the discussion of trivial and unreasonable situations, we assume that is sufficiently complex that the optimal solution is not known, and will require near infinite resources to identify. In order to solve , a model, or a solution, is constructed, where consists of inputs from sensor measurements and its outputs. Let be the universal set of solutions. As models can be constructed using different methods, we define the following:
We also assume that there exist multiple competing models from each of the above defined sets. A PEAI system (or model), , belongs to the set
For a complex task , it is highly unlikely that one arrives at exactly the same mathematical model when using different modeling methods, therefore the intersections between any two sets of model sets are expected to be empty, i.e, . Under these situations, reduces to
We will examine for the remainder of this paper and discuss strategies for constructing . For a complex problem, models are expensive to make, and no one model is perfect. We consider composing finite number of models. Finally we assume that all models that are examined in are constructed in good faith and aim to provide the best results possible given their application and method.
2.2 Construction of PEAI
There are two strategies to make a PEAI algorithm. The first is a composite model output approach - take models from the sets , , and , and combine their outputs to form a new composite model in . The composite model approach will be analyzed using an analogy to classical mean-variance portfolio theory. The second is a hybridization model approach - modify the form of constructing the model by applying an intelligent constraint using information from another domain, generating a model in . We will analyze this hybridized model by using classical bias-variance trade-off analysis. We show that composed models using the above strategies yield a more consistent model for a desired accuracy.
2.2.1 Composite PEAI using Mean-Variance Portfolio Theory
The 1990 Nobel prize was awarded to Harry Markowitz for his 1952 ‘Portfolio Selection’ essay . His work laid the mathematical foundation of diversification by demonstrating that the combination of risky assets is less risky than any single asset. By treating the available models in as risky assets, we can maximize user trust by minimizing the variance (risk) of the composite model. While this is conceptually a simple idea, it has profound impact on the understanding of ML ensembles and composite model techniques.
Assume there exists a function : that can be evaluated on each of the models that gives a meaningful representation of the model performance. Each model , has the output . Without a loss of generality, we further assume larger value of indicate better performance. For the combined model, we have
where is the relative weight given to model with , and . Therefore is composed of all models. Using the distributive property of the expectation denoted by
, we solve for the first moment of,
where , , and . We will assume that each element of is not equal to each-other, as the models are expected to give different expected values.
The second moment of ,
where is the covariance matrix. We will assume that exists, as a result of the models each being unique rather than a composition of other models.
By minimizing the variance of the combined model, we are reducing risk. As the combined model is more likely to give a consistent answer. The minimization is subject to the following previously defined constraints: and , where .
To find the optimized weights, we construct the Largrangian, :
Taking the partial derivatives of with respect to , , and :
By rearranging equation 18,
Define new variables
Finally by substituting the above results into equation 16,
It is important to note a few properties about , , and . Since is positive definite, and , and by the Cauchy-Schwarz inequality .
Equation 26 allows us to solve for the minimal risk for a desired mean value. For each being unique, the models not being perfectly correlated, and , the feasible region for portfolio theory can be shown to be a two-dimensional surface that is convex to the left, and is represented on the vs. plane, see Figure 1. In this figure, the class of optimal combined models lines on the thick black line between points and . A sub-optimal model found by combining model outputs from various domains, as shown by , lies within the region with a dotted line boundary, and an unfeasible model lies outside this boundary, as shown by point .
An example of the feasible point could be constructed and interpreted under the following conditions: If one were to make a model using a linear combination of the outputs of all models, and assign each model the weights of , then one would construct the equivalent of an ensemble of models using majority vote to make a decision. In practice this has shown to increase the accuracy and reduce the variance of the model ; however, here we show that there exists a set of weights for each of these models that would minimize the risk of the prediction. Therefore, this ensemble, majority vote model is likely to be a sub-optimal solution for a given performance.
In the case where models come from different domains, it is more likely that the models are going to have different , be less correlated with each other, and have different properties of their predictions. For example, the models in are more likely to be more accurate than those in , but have a higher variance. Therefore when these models are combined the composite model is able to obtain a more optimal performance than any one given model.
2.2.2 Hybridization PEAI using Bias-Variance Trade-Off Analysis
Hybridized PEAI models can be shown to have lower risk and enhanced user trust. First we derive the expressions of bias and variance in a model. Here we will assume that the model can be represented as a function of the inputs plus some error:
where is noise with zero mean and variance , such that and . Also let be a deterministic approximation the function , and .
Therefore we can express the mean squared error as a function of the , , and :
Normally, a physics models tend to have higher bias, but low variance. On the other hand, AI models tend to have a high variance and low bias. By placing physics constraints on the AI model during learning or during run-time, one places a bias on the new hybrid model, limiting the output space. This will cause the model to have a larger bias, and if done correctly will dramatically reduce the variance of the hybrid model.This concept can be shown graphically by model complexity vs. error diagram in Figure 2. By intelligently introducing physics based constraints to AI models or vice versa, we can arrive at models that have lower total error.
3 Implications of PEAI
Human learning builds on human observations and empirical evidence of the surrounding world, and this accumulated and learned knowledge is passed on, resulting in a systems design or model that is physics-based, as shown by the architecture in Figure 3. Similarly, data driven AI approaches use ML to arrive at a design or model based on the collected data. The combined model allows solutions of AI to be constrained by physical solution and expert knowledge, which enhances performance and trust. In addition, user trust in PEAI can be further enhanced by having a human supervisor in the loop, where the supervisor monitors the AI and provides feedback to the system that can improve its performance.
It is interesting to point out that many have already worked on the class of PEAI systems, though the direct increase in user trust was not the motivation. For example, AI system with physical constrains is shown best by the work Physics informed deep learning [21, 22]
, where differential equation constraints that can represent a physical model is combined with a neural network to form a PEAI. In the field of predictive turbulence modeling, Physics-informed ML framework has been proposed, where the functional form of the Reynolds stress discrepancy in Reynolds-averaged Navier-Stokes (RANS) is learned directly based on available data. In the area of real-time vision-based event monitoring at industrial sites, Physical constraints are added to AI models in order to improve performance and enhance user trust.
Being able to quickly identify and design AI solutions that have potential for wide-range industrial adoption is challenging. The AI solutions that are more likely to receive wider adoption are ones that can earn trust from users, and deliver an improved productivity at the same time. By explicitly pointing out the connection between enhanced user trust and combining models from different domains, it is suggested that one should always seek to combine AI models with prior models, if available.
Physics enhanced AI (PEAI) is a class of model that is formed by intelligently combining models from the domains of artificial intelligence or machine learning with physical and expert models. It was shown that by doing so, model risk is reduced, resulting in a more “trustworthy” model than any one model from a single domain. PEAI is shown as a solution to improve user adoption of an AI which solves a complex yet narrow enough problem.
-  V. C. Müller and N. Bostrom, Future Progress in Artificial Intelligence: A Survey of Expert Opinion, pp. 555–572. Cham: Springer International Publishing, 2016.
-  Y. Qian, Y. Yongxin, L. Feng, S. Yi-Zhe, and X. Tao, “Sketch-a-net: A deep neural network that beats humans,” Int J Comput Vis, vol. 122, pp. 411–425, 2017.
-  C. Lu and X. Tang, “Surpassing human-level face verification performance on lfw with gaussianface,” in AAAI, 2015.
K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on imagenet classification,” in
The IEEE International Conference on Computer Vision (ICCV), December 2015.
-  D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of go with deep neural networks and tree search,” Nature, vol. 529, p. 484–489, 2016.
-  Y. M. Assael, B. Shillingford, S. Whiteson, and N. de Freitas, “Lipnet: Sentence-level lipreading,” CoRR, vol. abs/1611.01599, 2016.
-  N. Brown and T. Sandholm, “Superhuman ai for heads-up no-limit poker: Libratus beats top professionals,” Science, vol. 359, no. 6374, pp. 418–424, 2018.
-  S. Roeser, R. Hillerbrand, P. Sandin, and M. Peterson, eds., Risk and Trust, pp. 858–873. Dordrecht Heidelberg London New York: Springer, 2012.
-  A. Amina and M. Berrada, “Peeking inside the black-box: A survey on explainable artificial intelligence (xai),” IEEE Access, vol. 6, pp. 52138–52160, 2018.
-  S. Mohseni, N. Zarei, and E. D. Ragan, “A survey of evaluation methods and measures for interpretable machine learning,” CoRR, vol. abs/1811.11839, 2018.
-  F. K. Došilović, M. Brčić, and N. Hlupić, “Explainable artificial intelligence: A survey,” in 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pp. 0210–0215, May 2018.
-  O. Biran and C. V. Cotton, “Explanation and justification in machine learning : A survey or,” in IJCAI-17 Workshop on Explainable AI (XAI) Proceedings, 2017.
-  T. Miller, “Explanation in artificial intelligence: Insights from the social sciences,” CoRR, vol. abs/1706.07269, 2017.
-  D. Sculley, J. Snoek, A. Wiltschko, and A. Rahimi, “Winner’s curse? on pace, progress, and empirical rigor,” ICLR 2018 Workshop, 2018.
-  D. Castelvecchi, “Can we open the black box of AI?,” Nature, vol. 538, pp. 20–23, Oct. 2016.
H. Jiang, B. Kim, and M. R. Gupta, “To trust or not to trust a classifier,” inNeurIPS, 2018.
-  J. Heinrich and N. Ofir, “Identifying and correcting label bias in machine learning,” arXiv, vol. 1901.04966, 2019.
-  M. Hind, S. Mehta, A. Mojsilovic, R. Nair, K. N. Ramamurthy, A. Olteanu, and K. R. Varshney, “Increasing trust in ai services through supplier’s declarations of conformity,” arXiv preprint arXiv:1808.07261, 2018.
-  H. Markowitz, “Portfolio selection,” The journal of finance, vol. 7, no. 1, pp. 77–91, 1952.
-  F. Chollet, Deep Learning with Python. Greenwich, CT, USA: Manning Publications Co., 1st ed., 2017.
-  M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics informed deep learning (part ii): Data-driven discovery of nonlinear partial differential equations,” arXiv, vol. 1711.10566, 2017.
-  M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations,” arXiv, vol. 1711.10561, 2017.
-  J.-X. Wang, J. Wu, J. Ling, G. Iaccarino, and H. Xiao, “A comprehensive physics-informed machine learning framework for predictive turbulence modeling,” arXiv preprint arXiv:1701.07102, 2017.